You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Talk:SRE/business case/Disposable Development Environment: Difference between revisions

Revision as of 16:56, 1 August 2022

5.2.5

jbond, can you expand on why this would probably make pontoon incompatible with WMCS? JHathaway (talk) 23:00, 29 July 2022 (UTC)

I have tried to clarify, but I'll expand a bit here. In pontoon we need to disable a bunch of puppet classes that are incompatible with the WMCS base puppet policy. The main one that comes to mind is the admin module, as WMCS uses LDAP authentication. Maintaining compatibility with WMCS whilst trying to replicate production as closely as possible seems like a counter-intuitive goal, and has been born more out of working with what we have. Further to this, by maintaining WMCS compatibility we ensure that pontoon only works with WMCS, and needs additional work to be compatible with more classic vanilla cloud environments, e.g. GCP, AWS etc. Jbond (talk) 09:14, 1 August 2022 (UTC)
Thanks jbond, that helps. I tweaked the clarification a bit. JHathaway (talk) 16:36, 1 August 2022 (UTC)

Owner?

I think an important section should be added about whose responsibility it would be to maintain whatever solution is developed. Development environment methodologies without an owner tend to degrade over time and require significant effort to keep running smoothly. JHathaway (talk) 23:00, 29 July 2022 (UTC)

I agree; however, I'm curious what @Jobo thinks, as they created the business case template. That said, I'm not sure I have an answer as to who the owner would be. Jbond (talk) 09:17, 1 August 2022 (UTC)

Physical Environment?

I think it would be helpful to add a Physical Environment as another option. Though it would be a tremendous amount of work and has significant downsides, it does provide features that are impossible to obtain with any of the other solutions, e.g. testing a ganeti upgrade on real hardware, or testing our re-imaging process. JHathaway (talk) 23:02, 29 July 2022 (UTC)

I think physical hardware environments are useful; however, I'm not sure they fit with this business case, specifically given its name, "Disposable ...". I see this case as being more about creating the tooling and shared services for development environments, as opposed to actually creating the environments. I also think that we should focus this proposal strictly on the development process and not on things like staging or performance testing.

I think it would be nice if the tooling could also spin up a development environment, but my gut feeling is that it would add a significant amount of work to support. As a side note, you can generally ask dc-ops for some spare hardware if you need to test something on physical hardware. Jbond (talk) 09:21, 1 August 2022 (UTC)

Perhaps it would be helpful to add an out of scope section, which acknowledges that a physical staging environment would have unique benefits, but this proposal is targeting the many other development needs that can be served without having a duplicate physical environment? JHathaway (talk) 16:45, 1 August 2022 (UTC)

Sample Development Tasks

I think it might be helpful to add sample development tasks and show how the different development options might help their development. Here are some sample projects and how I would order which environment methodology would be the most helpful:

General Methodologies

  1. Physical Deployment
  2. Cloud Deployments
  3. Container Deployments

Sample Dev Work

  1. Deploying single sign-on solution for our infrastructure
    1. Container Deployments
    2. Cloud Deployments
    3. Physical Deployment
  2. Network automation change
    1. Physical Deployment
  3. Testing a Ganeti Upgrade
    1. Physical Deployment
  4. Adding config parameters to Apache
    1. Container Deployments
    2. Physical Deployment
    3. Cloud Deployments
  5. Testing a new version of Exim
    1. Physical Deployment
    2. Container Deployments
    3. Cloud Deployments JHathaway (talk) 23:07, 29 July 2022 (UTC)
    My personal view is that network automation changes should be very much out of scope of this. I think @Cathal Mooney has been playing with a tool similar to GNS3 which would probably be better for network automation testing.
    For "Adding config parameters to Apache", can you expand on why a physical environment is better than a disposable cloud environment? Similarly, why is a physical environment the most preferred option for exim changes? My general view is that the cloud environment should be so close to production as to be no different from physical (other than performance). Containers will likely have some differences, but we should aim for them to be small and known. So the difference between cloud and container becomes a choice between accuracy (VMs) and speed (containers). Jbond (talk) 09:26, 1 August 2022 (UTC)
    I think it is okay if network automation testing is outside the scope of this proposal, but I think it would be valuable to note that in the doc. JHathaway (talk) 16:46, 1 August 2022 (UTC)
    Apologies for the lack of clarity in my topic. I think of a physical environment as having the highest fidelity to production in comparison to the other methods. That high fidelity makes it easier in some instances to have confidence that a change will behave as you expect in production. For example, exim, or rather email in general, relies on external systems: public DNS, the public internet, and public mail servers. Ideally a physical environment would connect with those systems in a similar manner to production and provide confidence that a change, say to how we perform SPF checks in exim, will work as expected in production. Hopefully a cloud or container based solution would allow you to mock out some of those interactions, but I don't think it is an easy problem to solve. The Apache config case is less compelling, as I agree with you that a cloud or container environment could get very close to production, but it is still nice to know that your curl command is hitting almost the exact same stack as it will in production. JHathaway (talk) 16:56, 1 August 2022 (UTC)
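
    As a rough illustration of what "mocking out" the public DNS dependency could look like in a cloud or container environment, here is a minimal Python sketch. It is not how exim evaluates SPF and it is not part of any proposal on this page. All names (`make_fake_resolver`, `spf_allows_ip`) are hypothetical, and the SPF handling is deliberately simplified to `ip4:` mechanisms only, on the assumption that the check under test can be handed a resolver function instead of querying real DNS.

```python
import ipaddress
from typing import Callable, Dict, List

# Hypothetical type: maps a domain name to its TXT records.
TxtResolver = Callable[[str], List[str]]


def make_fake_resolver(records: Dict[str, List[str]]) -> TxtResolver:
    """Return a resolver that answers from a fixed table instead of public DNS."""
    def resolve(domain: str) -> List[str]:
        return records.get(domain, [])
    return resolve


def spf_allows_ip(domain: str, ip: str, resolve_txt: TxtResolver) -> bool:
    """Simplified SPF check: True only if an ip4: mechanism covers the address.

    Assumption: include:, a, mx, redirect and qualifiers are ignored here,
    so this is far less than a real SPF implementation does.
    """
    address = ipaddress.ip_address(ip)
    for record in resolve_txt(domain):
        if not record.startswith("v=spf1"):
            continue
        for term in record.split()[1:]:
            if term.startswith("ip4:"):
                network = ipaddress.ip_network(term[len("ip4:"):], strict=False)
                if address in network:
                    return True
        return False  # an SPF record exists but nothing matched
    return False  # no SPF record published for this domain


# Usage: the test environment supplies canned DNS answers.
fake_dns = make_fake_resolver({"example.org": ["v=spf1 ip4:192.0.2.0/24 -all"]})
allowed = spf_allows_ip("example.org", "192.0.2.10", fake_dns)
rejected = spf_allows_ip("example.org", "198.51.100.1", fake_dns)
```

    The sketch also shows the limit raised above: the fake resolver only proves the logic against the answers you thought to script, whereas an environment connected to the real systems exercises whatever they actually return.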