Traffic cache hardware

Revision as of 13:36, 27 May 2022

This is an overview of our currently deployed and active cache hardware at the Traffic layer.

= Deployed Hardware Classes =

We have purchased and retired multiple classes of server hardware over the years, on staggered timeframes. In general, we will always have multiple overlapping classes of hardware in service as their various warranty and support periods expire. These are the currently active hardware configuration classes:

{| class="wikitable"
|-
! Label !! Model !! CPU Type/Speed !! Phys Cores !! RAM !! Cache Storage !! NIC speed, type, driver !! DC Ops Config
|-
| L || Dell R430 || 2x Xeon E5-2650 v4 @ 2.20GHz || 24 || 384 GB || 2x Intel S3710 800GB SSD || 10G, BCM57810, bnx2x || Legacy
|-
| F1 || Dell R440 || 2x Xeon Gold 5118 @ 2.3GHz || 24 || 384 GB || 1x Samsung PM1725a 1.6TB NVMe (U.2 SFF) || 10G, BCM57412, bnxt || F-10G, +storage card
|-
| F2 || Dell R440 || 2x Xeon Gold 5118 @ 2.3GHz || 24 || 384 GB || 1x Samsung PM1725b 1.6TB NVMe (HHHL Card) || 10G, BCM57412, bnxt || F-10G, +storage card
|-
| F3 || Dell R440 || 2x Xeon Gold 5118 @ 2.3GHz || 24 || 384 GB || 1x Samsung PM1725b 1.6TB NVMe (HHHL Card) || 10/25G, BCM57412, bnxt || F-10G, +storage card, +10/25G NIC variant
|-
| F4-T || Dell R450 || 2x Xeon Gold 5318Y @ 2.1GHz || 48 || 512 GB || 1x 6.4TB NVMe Card || 10/25G, BCM57414, bnxt || Proposed for Text, mid-2022 *
|-
| F4-U || Dell R450 || 2x Xeon Gold 5318Y @ 2.1GHz || 48 || 512 GB || 2x 6.4TB NVMe Card || 10/25G, BCM57414, bnxt || Proposed for Upload, mid-2022 *
|}
* F4 is our proposed new Config F, not yet deployed anywhere. The new Config F is exclusive to these edge cache roles, and thus includes the storage cards and NIC upgrades as part of its base definition.
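
To make the classes easier to compare, the per-node figures from the table can be tallied in a few lines of Python. This is only a sketch: the `HardwareClass` type and its field names are invented for this example, and the capacities are the nominal device sizes listed in the table, not usable cache space.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareClass:
    """Hypothetical record for one row of the hardware class table."""
    label: str
    phys_cores: int
    ram_gb: int
    drives: int    # number of cache storage devices per node
    drive_gb: int  # nominal size of each device, in GB

    @property
    def cache_storage_gb(self) -> int:
        # Raw per-node cache storage: device count times nominal size.
        return self.drives * self.drive_gb


# Values copied from the table; F4-T and F4-U are proposed, not deployed.
CLASSES = [
    HardwareClass("L", 24, 384, 2, 800),
    HardwareClass("F1", 24, 384, 1, 1600),
    HardwareClass("F2", 24, 384, 1, 1600),
    HardwareClass("F3", 24, 384, 1, 1600),
    HardwareClass("F4-T", 48, 512, 1, 6400),
    HardwareClass("F4-U", 48, 512, 2, 6400),
]

for hw in CLASSES:
    print(f"{hw.label:5} {hw.phys_cores:3} cores  {hw.ram_gb:4} GB RAM  "
          f"{hw.cache_storage_gb:6} GB cache storage")
```

On these nominal numbers, L and F1 through F3 all carry 1.6 TB of cache storage per node, while F4-T carries 6.4 TB and F4-U 12.8 TB.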

= Deployed Server Hardware by Datacenter and Cluster =

{| class="wikitable"
|-
! Datacenter !! text !! upload !! total
|-
| eqiad || 8x F1 || 8x F1 || 16x F1
|-
| codfw || 8x F2 || 8x F2 || 16x F2
|-
| esams || 8x F2 || 8x F2 || 16x F2
|-
| ulsfo || 6x L + 2x F3 || 6x L + 2x F3 || 12x L + 4x F3 = 16
|-
| eqsin || 6x L + 2x F2 || 6x L + 2x F2 || 12x L + 4x F2 = 16
|-
| drmrs || 8x F3 || 8x F3 || 16x F3
|-
| total || 12x L + 36x Fn = 48 || 12x L + 36x Fn = 48 || 24x L + 72x Fn = 96
|}
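
The totals row can be re-derived from the per-datacenter rows. The sketch below does that in Python; the `PER_CLUSTER` mapping and the `fleet_totals` helper are names made up for this illustration, and the counts are simply copied from the table (the text and upload clusters have the same mix in every datacenter).

```python
from collections import Counter

# Node counts per cluster, copied from the table above. Each datacenter
# deploys the same mix for both the text and the upload cluster.
PER_CLUSTER = {
    "eqiad": {"F1": 8},
    "codfw": {"F2": 8},
    "esams": {"F2": 8},
    "ulsfo": {"L": 6, "F3": 2},
    "eqsin": {"L": 6, "F2": 2},
    "drmrs": {"F3": 8},
}
CLUSTERS = ("text", "upload")


def fleet_totals(per_cluster, clusters=CLUSTERS):
    """Sum node counts by hardware class across all DCs and clusters."""
    totals = Counter()
    for mix in per_cluster.values():
        for label, count in mix.items():
            totals[label] += count * len(clusters)
    return totals


totals = fleet_totals(PER_CLUSTER)
legacy = totals["L"]
f_nodes = sum(n for label, n in totals.items() if label != "L")
print(f"{legacy}x L + {f_nodes}x Fn = {legacy + f_nodes}")
# prints: 24x L + 72x Fn = 96
```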

= End State of Proposed FY22-23 changes + refreshes =

* ulsfo and eqsin are refreshed to the new-standard 16x F4 config in the first half of the FY.
* The 8x off-cycle (newer, still in warranty) F-nodes in ulsfo and eqsin (4x F3 and 4x F2 respectively) are shipped to eqiad.
* eqiad installs these into the new E+F rows for a number of reasons:
** Makes use of the new rows in eqiad (more load/redundancy spread)
** Tests the impact of expanded server counts in general
** Re-uses this good hardware instead of tossing it, so we don't waste it just because it was purchased off-cycle
** Buys us time to push the natural eqiad warranty refresh out another FY, spreading out refresh cycles better (too many this year!)
** Allows the F4 refreshes in ulsfo+eqsin to be whole-DC upgrades, since F4 enables whole-DC architecture changes in traffic routing
* esams is refreshed in Q4 to the same new F4 config as ulsfo+eqsin (we have some time and space to adjust this based on earlier outcomes if necessary).
{| class="wikitable"
|-
! Datacenter !! text !! upload !! total !! notes
|-
| eqiad || 8x F1 + 4x F2 || 8x F1 + 4x F3 || 16x F1 + 4x F2 + 4x F3 || Reinforced this FY
|-
| codfw || 8x F2 || 8x F2 || 16x F2 || no changes this FY
|-
| esams || 8x F4-T || 8x F4-U || 16x F4 || refreshed to F4 in Q4
|-
| ulsfo || 8x F4-T || 8x F4-U || 16x F4 || refreshed to F4 in Q1
|-
| eqsin || 8x F4-T || 8x F4-U || 16x F4 || refreshed to F4 in Q2
|-
| drmrs || 8x F3 || 8x F3 || 16x F3 || no changes this FY
|-
| total || 52x Fn || 52x Fn || 104x Fn ||
|}
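
To see what the plan changes at the fleet level, the currently deployed table and the end-state table can be compared class by class. The Python sketch below is illustrative only: `CURRENT`, `END_STATE` and `by_class` are names invented here, and the counts are copied from the "total" columns of the two tables.

```python
from collections import Counter

# Total nodes per datacenter (text + upload), from the deployed table.
CURRENT = {
    "eqiad": {"F1": 16},
    "codfw": {"F2": 16},
    "esams": {"F2": 16},
    "ulsfo": {"L": 12, "F3": 4},
    "eqsin": {"L": 12, "F2": 4},
    "drmrs": {"F3": 16},
}
# Total nodes per datacenter in the proposed end state.
END_STATE = {
    "eqiad": {"F1": 16, "F2": 4, "F3": 4},
    "codfw": {"F2": 16},
    "esams": {"F4-T": 8, "F4-U": 8},
    "ulsfo": {"F4-T": 8, "F4-U": 8},
    "eqsin": {"F4-T": 8, "F4-U": 8},
    "drmrs": {"F3": 16},
}


def by_class(fleet):
    """Collapse a per-datacenter mapping into fleet-wide counts per class."""
    totals = Counter()
    for mix in fleet.values():
        totals.update(mix)
    return totals


before, after = by_class(CURRENT), by_class(END_STATE)
# Keep only the classes whose fleet-wide count actually changes.
delta = {
    label: after[label] - before[label]
    for label in sorted(set(before) | set(after))
    if after[label] != before[label]
}
print(delta)
# prints: {'F2': -16, 'F4-T': 24, 'F4-U': 24, 'L': -24}
print(sum(before.values()), "->", sum(after.values()))
# prints: 96 -> 104
```

On these numbers the fleet grows by 8 nodes: 48 F4 nodes come in, while the 24 L nodes and 16 F2 nodes drop out of the counts. The F2 and F3 nodes that move to eqiad cancel out in the per-class totals because they stay in service.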