The Gathering Technical blog

Short status update…

17 Apr 2014, by from Tech:Net

Some of you may have experienced problems with the internet, the wireless and the network in general. We have had some minor issues with the internet link, the internal routing and the wireless. Everything was on track and working before 09:00 on Wednesday morning, but we never really know how well things work until at least a few thousand participants arrive, connect to the network and put some load on it.

The wireless:
We had some small problems with the servers to start with, and then some small problems with the configuration. The main problem here was that we had to prioritize the cabled network.

We are still working on improving the wireless solution and hope that we have everything optimized by tomorrow morning.

The internal network:
We don’t have one specific problem to point to, more like hundreds of small problems. The list is long and it contains everything from bugs in software to missing parts and some human error. But there have not been any major incidents.

The internet, which is a two part problem:
1. We have 4x10Gig links in a port bundle down to Blix Solutions in Oslo. These were connected and tested OK on Friday. When participants arrived on Wednesday and the links became loaded with traffic we started to see problems with the load balancing. We removed two ports that weren’t performing well from the bundle and continued on 100% working 20Gig (2x10Gig).

This morning, around 11:00, SmartOptics arrived with new optical transceivers and converters. They checked the transceivers on the links we had problems with using an optical microscope and could see that they weren’t completely clean. Using special cleaning sauce, they managed to remove the dust and dirt from our transceivers, leaving it to us to put them back in the bundle, now in 100% working condition. Next year we’ll make sure to be more adamant about this before patching things together.

2. Origin, Steam, Blizzard, NRK, Microsoft, HP, Twitch… Some of these services rely on geolocation. There are multiple providers of geolocation services (like MaxMind), but they usually charge money per database pull. This means that the cheaper a company is, the longer it waits between pulls. As a result, services that update often see us as being in Norway, while companies that pull data from the geolocation database less frequently may see us in Russia, Puerto Rico, Italy or Antarctica.

The reason for this is that our IP-address range is a temporary allocation from RIPE. RIPE has a pool of IP-addresses they lend out for a short amount of time to temporary events. This means that we are not guaranteed to get the same IP-addresses every year, and that a lot of different events in different countries have been using the allocation in the months before us.

We are working continuously to solve this. We talk to Origin/EA and Valve, we try to NAT the most known and most used services through permanent Norwegian IP-addresses and we do ugly DNS-hacks. The sad fact is however that in the limited amount of time we have during TG, we won’t be able to solve this for every service.
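As a rough illustration of the stale-database effect described above, here is a minimal Python sketch. The allocation history, the dates, the country codes and the service names are all made up for the example; they are not real RIPE or TG data. It only shows why services that pull the geolocation database at different times can disagree about where the same prefix is.

```python
from datetime import date

# Hypothetical history of one temporary prefix: (start date, country code).
# These dates and countries are invented for illustration only.
ALLOCATION_HISTORY = [
    (date(2013, 9, 1), "IT"),
    (date(2013, 12, 1), "RU"),
    (date(2014, 3, 15), "NO"),
]


def country_on(day):
    """Return the country the prefix was registered to on a given day."""
    current = None
    for start, country in ALLOCATION_HISTORY:
        if start <= day:
            current = country
    return current


def country_seen_by(last_pull):
    """A service only knows what the database said at its last pull."""
    return country_on(last_pull)


# Hypothetical services and the date of their most recent database pull.
services = {
    "pulls weekly": date(2014, 4, 14),
    "pulls quarterly": date(2014, 1, 10),
    "pulls yearly": date(2013, 10, 1),
}

for name, last_pull in services.items():
    print(name, "->", country_seen_by(last_pull))
```

In this toy model only the service that pulled after the prefix moved to the event sees "NO"; the others still see whoever held the prefix at the time of their last pull.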

How to get optimus prime awesomesaucelicious wireless

16 Apr 2014, by from Tech:Net

5GHz… Connect to the broadcasted ESSID: “The Gathering 2014” <- this one is only 5GHz and you are 100% surest to getest the bestest and freshestest frequencies. YAY!! 😉

Legacy clients with only 2.4GHz can connect to the “The Gathering 2014 2.4Ghz” ESSID.
2.4GHz is only best effort – the main focus is stable 5GHz

The password for both is: Transylvania
N.B. with capital T

We has intarwebz

11 Apr 2014, by from Tech:Net

Woot oO \:D/

Speedtest 3

TeleGW#sh int po 1
Port-channel1 is up, line protocol is up (connected)
Hardware is EtherChannel, address is 001a.e316.a400 (bia 001a.e316.a400)
Description: Interwebz
Internet address is 185.12.59.2/30
MTU 1500 bytes, BW 40000000 Kbit, DLY 10 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full-duplex, 10Gb/s, media type is unknown
input flow-control is on, output flow-control is off
Members in this channel: Te5/4 Te5/5 Te6/4 Te6/5
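For anyone who wants to check the numbers in that output, here is a small Python sketch using the standard ipaddress module. It only reworks values printed above. That the other usable address in the /30 sits on the provider side is an assumption, and the per-member speed is taken from the "10Gb/s" line.

```python
import ipaddress

# Address and prefix length as shown in the interface output.
iface = ipaddress.ip_interface("185.12.59.2/30")

# A /30 has exactly two usable host addresses: ours and one other.
hosts = list(iface.network.hosts())
other_end = [h for h in hosts if h != iface.ip]

print("network:", iface.network)
print("usable hosts:", [str(h) for h in hosts])
print("other end (assumed provider side):", [str(h) for h in other_end])

# Four 10Gb/s members should add up to the reported bundle bandwidth.
members = ["Te5/4", "Te5/5", "Te6/4", "Te6/5"]
member_kbit = 10_000_000  # 10 Gb/s expressed in Kbit/s
bundle_kbit = len(members) * member_kbit
print("bundle bandwidth:", bundle_kbit, "Kbit")
```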

TeleGW#show ipv6 interfac po1
Port-channel1 is up, line protocol is up
IPv6 is enabled, link-local address is FE80::21A:E3FF:FE16:A400
No Virtual link-local address(es):
Description: Interwebz
Global unicast address(es):
2A02:ED01::2, subnet is 2A02:ED01::/64
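The link-local address in that output follows from the port-channel's MAC address via the modified EUI-64 rule (flip the universal/local bit, insert FF:FE in the middle). A minimal Python sketch of that derivation, with a helper name (`mac_to_link_local`) chosen purely for illustration:

```python
import ipaddress


def mac_to_link_local(mac):
    """Derive the modified EUI-64 IPv6 link-local address from a MAC address."""
    hex_digits = mac.replace(".", "").replace(":", "").replace("-", "")
    octets = bytearray.fromhex(hex_digits)
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert FF:FE between the two halves of the MAC to get 64 bits.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    # fe80::/64 prefix followed by the 64-bit interface identifier.
    return ipaddress.IPv6Address(b"\xfe\x80" + b"\x00" * 6 + interface_id)


link_local = mac_to_link_local("001a.e316.a400")
print(link_local)

# The global address should fall inside the /64 shown in the output.
global_addr = ipaddress.IPv6Address("2A02:ED01::2")
subnet = ipaddress.IPv6Network("2A02:ED01::/64")
print(global_addr in subnet)
```

With the MAC from the `sh int po 1` output this gives the same address the router reports, written in lower case.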

TG14 Design

04 Apr 2014, by from Tech:Net

TG14-Design

Some numbers from the Cisco kitlist:
5 4507R+E switches with redundant supervisors
166 wireless access points (mix of mainly 3602 and 2602, as well as 3502 and a few 1142)
2 5508 wireless controllers
7 4948E switches
2 4500-X switches
2 6500 switches
150+ optical 10G transceivers

TG network since 1996

04 Apr 2014, by from Tech:Net

Asle, a guy who is very interested in networking and, more specifically, the TG network, made this summary of the TG network (primary network vendor).

He has been crawling the internet up and down to locate as much accurate info about the core network as possible.

Check it out here: http://leknes.info/tgnett

The instability in the network at TG13

03 Apr 2014, by from Tech:Net

There have been a lot of questions about the instability in the network at TG13. We are very sorry for not explaining this during TG13 or any time sooner than this. We are very, very sorry. But behold, the explanation is here… and I can say it in one short sentence:

We had a defective backplane on one of the two supervisor slots of the 6500 that terminated the internet connection.

First a short intro to Cisco, Catalyst 6500, sturdiness and reliability: the Cisco Catalyst 6500 is a modular switch that has been in production since 1999. It’s packed with functionality and is one of the most reliable modular switches we know of.

You may ask why we ended up with a defective one if they are so reliable… and you are right to ask. The answer is split in two… One: they have produced and sold many, many thousands of units, and even with very strict test routines there will always be some troublesome units. Two: this type of equipment is not meant to be shipped and moved back and forth as much as the Cisco Demo Depot equipment is. In the end the equipment will show problems caused by all the moving back and forth, not to mention the line cards constantly being replaced.

If we were a regular customer, we would have run the equipment through extensive testing. And as a regular customer with a service agreement, we would have contacted the Cisco TAC (Technical Assistance Center) and registered a service request. They would either have solved the problem remotely, if it was a software bug or configuration error, or issued an RMA (return material authorization) for the faulty hardware. This takes time, and that is something we usually plan for when building a new, big network.

At TG we test all the equipment before we ship it to Vikingskipet. On the 6500 and 4500 we run the command “diagnostic bootup level complete” and reboot the switches. When they are up again we have a complete diagnostic of all the line cards in the switch, and we can easily spot if something is wrong. Both of our 6500s passed the tests last year, so we shipped them up to Vikingskipet and configured them, and everything was seemingly ok.

But not everything was ok. Not at all. We noticed something weird on Wednesday after the participants arrived. We had some strange latency and loss on the link between us and Blix Solutions. The interfaces did not have drops or errors; we just saw that more packets went out than came back. We first involved Blix Solutions in the investigation, and they checked everything at their end and everything was ok there, so we involved Eidsiva bredbånd. They checked everything between Blix Solutions and us, and everything was fine there as well.

This started to get stranger and stranger… Traffic was going out, but didn’t come back. No interface drops. No interface errors. Peculiar.

The setup was quite simple… 4x10Gig interfaces in a port-channel bundle towards Blix Solutions, with 2x10Gig interfaces (X2) in each supervisor (Sup720).
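To illustrate why a fault behind some members of such a bundle can hit only part of the traffic, here is a minimal Python sketch of hash-based member selection. This is not Cisco's actual load-balancing algorithm; the CRC32 hash, the member names and the generated flows are assumptions made for the example.

```python
import zlib


def pick_member(flow, members):
    """Map a flow tuple to one bundle member with a simple hash (illustrative only)."""
    key = "|".join(str(field) for field in flow).encode()
    return members[zlib.crc32(key) % len(members)]


# Hypothetical names: two 10Gig ports on each of the two supervisors.
members = ["sup1-port1", "sup1-port2", "sup2-port1", "sup2-port2"]
faulty = {"sup2-port1", "sup2-port2"}  # ports behind the defective slot

# Made-up flows: (source IP, destination IP, source port, destination port).
flows = [
    ("10.0.%d.%d" % (i // 250, i % 250 + 1), "198.51.100.7", 40000 + i, 443)
    for i in range(1000)
]

# Every packet of a flow takes the same member, so only some flows are hit.
affected = sum(pick_member(flow, members) in faulty for flow in flows)
print(affected, "of", len(flows), "flows cross the faulty supervisor")

# Taking the faulty ports out of the bundle moves every flow to healthy ports.
healthy = [m for m in members if m not in faulty]
print(all(pick_member(flow, healthy) in healthy for flow in flows))
```

The point of the sketch is only that each flow sticks to one member, so a fault behind two of four members shows up as trouble for a share of the flows rather than a clean outage.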

So after some testing and several different hypotheses during Wednesday, we waited until the day was over. Towards Thursday morning the traffic dropped below 8Gig. Once it was stable below 8Gig, we forced all the traffic over to one 10Gig interface at a time, first on the topmost supervisor. The traffic was going smooth as silk. I’m very sorry to say it, but we then actually failed the traffic over, one interface at a time, to the 10Gig interfaces on the other supervisor for five minutes, and guess what? The internet traffic had drops and we had latency… We forced it back to the first supervisor and everything went smooth again.

We had an answer for why we had the problems, but then we had to find a solution on the fly, in production, before 6000 nerds woke up and started using the internet again! OMG! WHAT A PRESSURE! oO

As you may know, the transport between Vikingskipet and Blix Solutions is “colored”, and we only had colored optical transceivers in the SFP+ format and SFP+ converters in X2 format. The rest of the 10Gig interfaces on the 6500 were XENPAK.

Luckily for us, Blix Solutions had put a TransPacket-box at each side, and this box can be configured with the right wavelengths (colors) on the interfaces. That gave us the possibility to translate the wavelength through the TransPacket-box and receive on a 10Gig XENPAK interface on the regular 1310 nm single-mode wavelength.

But this was not in place until a little later in the day on Thursday. So all the instability you may have experienced on Wednesday and “early” Thursday was caused by this. After this, there should not have been any troubles or instabilities in the network as far as we know.

At the same time as we explain this we also have to applaud both Eidsiva bredbånd and Blix Solutions for their service. They were working with us every step of the way to find a solution. And we can really say we know now why Blix has the name they have… Blix Solutions – because they are very open and forward about finding a solution, however far fetched or unrealistic it is.

Thank you both Eidsiva bredbånd and Blix Solutions for the cooperation in TG13!

We really look forward to working with both of you for TG14 🙂

TG14 Tech-solutions…

02 Apr 2014, by from Tech:Net

So… we have been quite busy between TG13 and TG14. We have been working intensely on how we can make TG even better and future-proof, both for us, the entire crew, and last but not least, and actually most important: THE PARTICIPANTS! 🙂

We are happy to announce that the internet connection will be provided through the same partners that contributed last year. Blix Solutions AS will be the one providing the internet capacity at their data center in Oslo, same as last year 30Gig (with optional +10Gig), while Eidsiva bredbånd AS will take care of the L1 transport link between Oslo and Hamar. SmartOptics AS will provide us with the optical transceivers the link requires.

We had a very good experience working with Blix Solutions, Eidsiva bredbånd and SmartOptics and we look forward to working with them again! 🙂

Blix is also sponsoring the servers for TG, which makes it possible to run, for instance, DNS, DHCP, MBD, Stream and much more.

The core network equipment will be delivered by one of our main sponsors, Cisco, and we are very happy and proud to be working with them yet again. As last year, we are building the hierarchical model: Core – Distribution – Access. We build with the same network equipment as last year, with a few minor upgrades.

The most interesting part this year is that we will continue to focus on the wireless network. Last year was the first time we provided a full-scale wireless experience, and we learned a lot! We hope to get the Wifi working even better for TG14. Our goal is to have a stable 5GHz service from day one. This is really a big challenge given the hostile environment we work in, but this year Cisco have provided us with the stadium wireless (same as they have in Telenor Arena), and we are really going for it!

We will also provide a new internal service for the crew, which is VoIP (or IP Telephony). This will help all our staff to communicate much easier when they need to have a conversation that is not proper to have on a walkie-talkie. All in all, we are set on making the situation better for everyone, and a big thanks to Cisco for making it possible!

Atea is still one of our main sponsors and Tech’s main contributors. They provide transport, shipping handling, cars, equipment and people. Working under a non-profit organisation we don’t have all the money in the world, and we very much depend on the good grace of our very nice sponsors to achieve what we do and continue to develop.

Lynet Internett will sponsor us with the webcams this year. This gives everyone a chance to get a glimpse of all the awesomeness going on. The web cameras will be available from approximately Saturday/Sunday and throughout the event.

Regarding the fiber installation we are working on these days, you can read the post about our first trip to Vikingskipet for installing the cables here: http://technet.gathering.org/2014/03/18/fiber-optics-improvements-in-vikingskipet-01/

And here:
http://technet.gathering.org/2014/03/31/fiber-optics-improvements-in-vikingskipet-2-and-3/

We wish to mention the main contributors that made the fiber installation possible:
Tessta & Ninjafiber – for lending us fiber rollers, thank you! 🙂
Atea – for lending us fiber welding and testing equipment, cars and resources. Thank you! 🙂
eMCOM – for lending us fiber welding equipment for a very nice price 🙂
HOA – especially Tor Arne and Knut for being very forthcoming and helping with everything. Thank you 🙂

And a big thank you to the contributors for the design, pricing, purchase, project management, installing and everything regarding the labor around the fiber upgrade:

Fredrik Haarstad
Design, pricing, equipment acquisition and installation.
Member of Tech:Net, works for Lynet Internett

Martin Karlsen
Project manager and installation.
Member of Tech:Net, works for Marcello Consulting

Jostein Grepperud
Expert fiber installer.
Worked previously in Relacom – works now for Marcello Consulting – Thank you! 🙂

Mathias Bøhn Grytemark
Expert fiber installer.
Member of Tech:Net – worked previously in Relacom – works now for Vestviken 110 (emergency services)

Espen K. Olsen
MacGyver, installing and fixing odds and ends.
Member of Tech:Net – works for Atea.

Erlend Røsok
Network Ninja, planning, installing and fixing odds and ends.
Member of Tech:Net – works for MET (Meteorologisk institutt).

This year’s core routing: Windows 98SE! :D

01 Apr 2014, by from Tech:Net

We got a bunch of old workstations with Windows 98SE donated, and since we love challenges we will build this year’s network on Windows 98 routing. We have tested a little bit in lab and the results are quite good, so we are very optimistic about this.

Here are some pictures of the setup:

[Six images: route4, route1, route2, route3, route5, route6]

We thank Fnutt Consulting (http://fnutt.net/) for the contribution in workstations! 😀

Fiber optics improvements in Vikingskipet #2 and #3

31 Mar 2014, by from Tech:Net

We have had two more weekends in Vikingskipet since last time. We finished all the fiber welding, mounted the ODFs and tested all the cables. Only a few of the 444 fibers, spread over 13 ODFs and 1 km (1000 m) of cable, have to be welded a second time.

Here are some juicy shots from the two past weekends:

[Photo gallery: 33 images, 11 of them timestamped 21–23 March 2014 and 22 camera shots numbered DSC01341 to DSC01533]

About

TG - Technical Blog is the unofficial rambling place for The Gathering.