
Wikimedia Cloud Services team/EnhancementProposals/2020 Network refresh/2020-11-10-checkin: Difference between revisions

imported>Arturo Borrero Gonzalez
(→‎2020-11-10 WMCS network checkin: add pointer to the NFS ideas document)
 
= 2020-11-10 WMCS network checkin =
* Status updates
* Questions, feedback?
* Next to do's
== status updates from arturo ==
* cloudsw in codfw could be interesting
** test changes before introducing in eqiad


* cloudgw PoC in codfw going well:
** if we are fully happy with the codfw setup, then we should think about eqiad1
** this requires procuring 2 HW servers (could be misc spares with 1x 1G NIC and 2x 10G NICs)
** our plans for NFS will likely leverage this setup: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/notes/NAT_loophole/NFS
 
== Questions ==
 
* faidon: netns doesn't provide enough isolation of both realms (context: NFS design ideas)
** arturo: OpenStack does exactly the same in other services
** brooke: NFS doesn't provide direct escalation paths
** arturo: we are assuming risks when running the cloud in the same DC as prod, we share stuff. OpenStack bridges things by nature.
* faidon: defense in depth approach
* andrew: we take prod management for granted. Is this in scope?
* faidon: can we drop VMs reaching the wikis using cloud addresses now?
* arturo: not in our KRs for this quarter
* arzhel: can we please work on reviewing ACLs this quarter? https://phabricator.wikimedia.org/T264993
* arturo: yes, I can do that
* birgit: key takeaway for NFS, how many layers of security we are adding vs the current model
* brooke: we are in the early stage of NFS evaluation: we are in the research phase, not in the feedback collection phase.
* birgit: can we have 3 top goals for the NFS project? like guiding principles, to make sure we keep on track (e.g. high performance for toolforge users, more security layers, etc.).
* faidon: add realm boundaries to diagrams, to make clear where prod/cloud start and end.
* arturo: any concerns with the cloudgw project moving forward?

Latest revision as of 15:57, 10 November 2020

2020-11-10 WMCS network checkin

  • Status updates
  • Questions, feedback?
  • Next to do's

status updates from arturo

  • cloudsw in codfw could be interesting
    • test changes before introducing in eqiad
  • cloudgw PoC in codfw going well:
    • neutron agreed to work without doing the SNAT, which is now done by cloudgw.
    • VMs now use floating IPs (if they have one) for connections outside the cloud. No shortcomings detected so far, other than needing to refresh firewalling and other ACLs.
    • neutron is happily running without our custom hacks.
  • next steps, address some limitations:
    • no NIC bonding in data plane. Would like to try it out, requires DCops work for the additional cable patch.
    • no HA. Would like to try it out, requires another server in codfw.
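
As a rough illustration of the egress behaviour noted above (a VM with a floating IP uses it for connections outside the cloud, anything else goes through the shared SNAT now done by cloudgw), here is a minimal Python sketch. It only models which source address an outside host would see, not which component performs the translation. The function name, the mapping and all addresses are hypothetical and are not taken from the actual codfw setup.

```python
from ipaddress import IPv4Address, ip_address
from typing import Mapping


def egress_source(
    vm_addr: IPv4Address,
    floating_ips: Mapping[IPv4Address, IPv4Address],
    shared_snat_addr: IPv4Address,
) -> IPv4Address:
    """Return the source address an outside host would see for a VM.

    Assumption: a VM with a floating IP is seen with that address,
    every other VM is seen with the shared SNAT address.
    """
    floating = floating_ips.get(vm_addr)
    if floating is not None:
        return floating
    return shared_snat_addr


# Hypothetical example data (private and documentation ranges only).
FLOATING_IPS = {ip_address("172.16.0.10"): ip_address("198.51.100.10")}
SHARED_SNAT = ip_address("192.0.2.1")

if __name__ == "__main__":
    # VM with a floating IP: seen as 198.51.100.10
    print(egress_source(ip_address("172.16.0.10"), FLOATING_IPS, SHARED_SNAT))
    # VM without a floating IP: seen as the shared address 192.0.2.1
    print(egress_source(ip_address("172.16.0.11"), FLOATING_IPS, SHARED_SNAT))
```

A lookup like this can help when refreshing firewalling and ACLs, since it makes explicit which source address each VM would present to outside services.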

Questions

  • faidon: netns doesn't provide enough isolation of both realms (context: NFS design ideas)
    • arturo: OpenStack does exactly the same in other services
    • brooke: NFS doesn't provide direct escalation paths
    • arturo: we are assuming risks when running the cloud in the same DC as prod, we share stuff. OpenStack bridges things by nature.
  • faidon: defense in depth approach
  • andrew: we take prod management for granted. Is this in scope?
  • faidon: can we drop VMs reaching the wikis using cloud addresses now?
  • arturo: not in our KRs for this quarter
  • arzhel: can we please work on reviewing ACLs this quarter? https://phabricator.wikimedia.org/T264993
  • arturo: yes, I can do that
  • birgit: key takeaway for NFS, how many layers of security we are adding vs the current model
  • brooke: we are in the early stage of NFS evaluation: we are in the research phase, not in the feedback collection phase.
  • birgit: can we have 3 top goals for the NFS project? like guiding principles, to make sure we keep on track (e.g. high performance for toolforge users, more security layers, etc.).
  • faidon: add realm boundaries to diagrams, to make clear where prod/cloud start and end.
  • arturo: any concerns with the cloudgw project moving forward?