Once again on the benefits of traffic analyzers

    This week the hub published a very informative article on using a microscope to ensure a high-quality fiber-optic connection. It reminded me that every specialty has its own tools which, despite their seemingly obvious usefulness, are not used by everyone. In system administration, one such tool is the traffic analyzer. Below the cut is a typical story illustrating its usefulness (seasoned specialists will not find anything new here - the story is aimed at beginners).

    Let's start, as tradition dictates, from afar. The customer decided to upgrade their Microsoft Active Directory forest and domain from 2008 to 2012 R2. Strictly speaking, only an upgrade to 2008 R2 was needed, but given the complexity of such projects in a large environment (the customer had more than 1000 Windows servers in dozens of geographically distributed locations), the Service Owner decided to go straight to 2012 R2. Besides, the standard server build at that time was already Windows Server 2012 R2.

    To raise the functional level, you must first migrate all domain controllers to the new OS. From the Windows point of view, the process is quite simple. The difficulties arise in locations where something third-party is integrated with the Active Directory environment. That is, almost everywhere :)

    A full list of the problems we hit during that migration would fill several articles. Here we are interested in just one medium-sized location: two controllers, one thousand users, and two EMC Celerra NAS devices (plus, of course, hundreds of servers, databases and applications, but we won't talk about those). Besides hosting shared resources, the NAS devices were used to store user profiles. When a site has two controllers, the migration is much simpler: we can migrate one controller and, if something goes wrong, simply shut it down, because the second one is still there. It is worth noting that by this point several locations had already been migrated successfully, and nobody expected any particular problems.

    So, day X arrived and one of the controllers was removed from the domain. We reinstalled the OS and promoted the server to a domain controller again. It immediately became clear that this time things would not go smoothly. Users who got the new controller as their Logon Server lost access to their profiles and shared folders. Instead, they saw a sad message:

    image


    We shut down the problem controller, created a separate artificial site for it, added its IP address there with a /32 mask, moved one test client into that site and started testing (yes, we could have started with this, but to save time, and because the risks were low, the Service Owner allowed us to bring the controller up directly in the live site after the end of the working day).

    Recently there was an article about full-stack administrators. It is, without a doubt, very cool if you have the knowledge and the rights on all devices to solve a problem yourself. More often, though, a company has a rather strict division of teams by area of responsibility, and if you work in the Active Directory support team you technically cannot check the NAS settings. And since the problem appeared after a change to your infrastructure component, the problem is, by default, considered to be on your side. How do you find the cause of your troubles and get arguments for requesting some action from the other team?

    A traffic analyzer is an invaluable tool here. I am being slightly disingenuous: one of the important differences between Windows 2008 and Windows 2012 R2 is the new version of the SMB protocol, so I had a good idea of where the problem would be. My favorite tool in such cases is Wireshark (don't take this as advertising). A quick installation, start the capture, try to access the shared folder, and what do we see in the packet exchange log?

    NegotiateProtocol Request
    NegotiateProtocol Response
    SessionSetup Request
    SessionSetup Response
    TreeConnect Request Tree:
    TreeConnect Response
    Ioctl Request
    Ioctl Response, Error: STATUS_INVALID_DEVICE_REQUEST

    The last line, Ioctl Response, Error: STATUS_INVALID_DEVICE_REQUEST, shows us that the SMB exchange between the user and the NAS device does not complete: negotiation, session setup and tree connect go through, but the Ioctl request that follows is rejected. Given that everything works with the old controller, this confirmed my guess - the problem is in the new version of SMB. In general, the NAS devices in the customer's environment were supposed to support the new version of SMB (in other locations everything was fine), so the next idea was to check whether a firmware update was available for them. Bingo! The vendor forum confirmed that the old Celerra firmware version does not support the updated SMB. The information was sent to the NAS support team along with the packet exchange logs, links to the vendor's website and a request for a firmware update. The following weekend the firmware was updated, and tests confirmed that everything now worked.
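
    For beginners, the same reading of the log can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes the Info column of the capture has been copied out as plain text in the format shown in the listing above, and the helper names first_failure and completed_steps are made up for this example (they are not part of Wireshark or any library).

```python
import re

# Summary lines copied from the capture shown in the article (Info column only).
CAPTURE = [
    "NegotiateProtocol Request",
    "NegotiateProtocol Response",
    "SessionSetup Request",
    "SessionSetup Response",
    "TreeConnect Request Tree:",
    "TreeConnect Response",
    "Ioctl Request",
    "Ioctl Response, Error: STATUS_INVALID_DEVICE_REQUEST",
]

# Assumed line format: "<Command> Response, Error: <STATUS_NAME>".
ERROR_RE = re.compile(r"^(?P<command>\w+) Response, Error: (?P<status>STATUS_\w+)")


def first_failure(lines):
    """Return (index, command, status) of the first failed response, or None."""
    for index, line in enumerate(lines):
        match = ERROR_RE.match(line.strip())
        if match:
            return index, match.group("command"), match.group("status")
    return None


def completed_steps(lines):
    """Return the commands whose responses carried no error, in order."""
    steps = []
    for line in lines:
        text = line.strip()
        if text.endswith(" Response"):
            steps.append(text[: -len(" Response")])
    return steps


if __name__ == "__main__":
    print("Succeeded:", ", ".join(completed_steps(CAPTURE)))
    failure = first_failure(CAPTURE)
    if failure is not None:
        index, command, status = failure
        print(f"Failed at line {index + 1}: {command} -> {status}")
```

    The point of the sketch is the habit rather than the script: find the last step that succeeded and the first one that failed, and you already have something concrete to hand to the other team.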

    As an afterword. When I recommend that my friends use a traffic analyzer to investigate a problem, the most common reason people give for not doing so is that they think it is very difficult. This is not true! In most cases, to understand what is happening, it is enough to look at the packet exchange log and, occasionally, read a KB article on how the protocol in question works. It is very simple. And it can save you a ton of time.
