Author Archives: jawtheshark

About jawtheshark

I'm a computer scientist by heart, system and network engineer by day and usually sleeping at night (what did you expect?). I live in the only Grand Duchy on this planet.

Seven to ten to seven

As you undoubtedly know, for now my recommendation about Windows 10 is: Stay put when you’re on 7, upgrade when you’re on 8/8.1. If you disagree, that’s fine: do what works for you. Of course, there is an “if”, namely, you’d better upgrade to 10 in order to secure the 10 upgrade for free before the promotion ends. As such, I’ve been a busy bee, taking Windows 7 machines, making an image of their disk, then upgrading and then reverting to the 7 image.
Technically, you can upgrade to 10, ensure your machine is activated and then click the “revert to 7” button in the “Upgrade” section somewhere. You have 30 days to do this. Now, personally, I prefer the “image-upgrade-restore” process because you do not know what Microsoft does when you click the rollback button. Is your machine hash flagged? Well, you get to say what you think of 10, but there is most likely not a human soul that will ever see these complaints.

Being more the Unix guy, I automated my work as far as possible. The automation consists of three parts: an imaging script and two Windows scripts (reg and cmd). The first script is actually rather old and was originally written for another purpose: imaging newly bought PCs. It uses parted, so I assume that it should work on GPT partition layouts, but I have never tested this.

Now, to be entirely honest, you’re not going to manage the imaging without a little crash course on devices and the Linux command line. (Only tested on an Ubuntu 14.04 LiveUSB. Dependencies are: ntfsclone, dd, dmidecode, hdparm and probably a few others.)
Basically, you’ll run it as follows: sudo ./generate-image.sh /dev/sda
However, this assumes a few things: your working directory has my script, this working directory has enough space to store the generated images, and the disk you want to image is /dev/sda (which it most likely will be, but I cannot say for sure). You also need to be sure that no partition of /dev/sda is mounted. (Hey, now that’s something I could add to my script…)
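Such a mount check could look roughly like the sketch below. It is not part of generate-image.sh; the function names are made up for illustration, and it assumes the usual /proc/mounts layout (device in the first column, mount point in the second).

```shell
# Hypothetical helper, not taken from generate-image.sh.
# Prints every mounted partition of a disk; the second argument
# defaults to /proc/mounts and only exists to make testing easy.
mounted_partitions() {
    local disk="$1" mounts="${2:-/proc/mounts}"
    # Simple prefix match: /dev/sda also matches /dev/sda1, /dev/sda2, ...
    awk -v d="$disk" 'index($1, d) == 1 { print $1 " on " $2 }' "$mounts"
}

# Returns non-zero (and complains on stderr) when anything is still mounted.
check_unmounted() {
    local found
    found="$(mounted_partitions "$@")"
    if [ -n "$found" ]; then
        printf 'Still mounted:\n%s\n' "$found" >&2
        return 1
    fi
}
```

Running check_unmounted /dev/sda before imaging would stop early instead of cloning a filesystem that is still in use.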
When you run that script, it will create a directory based on your machine’s information, and will attempt to image the MBR (full and without partition table) and all partitions. For vfat it uses dd, for ntfs it uses ntfsclone, and it generates a restore.sh script for easy restoration. I’d say: cool, but you may think otherwise.
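To give an idea of what happens per partition, here is a dry-run sketch that only prints the commands. It is not the published script: the function names and output file names are assumptions, and the 446-byte figure is simply the size of the boot code area in a classic 512-byte MBR (the remaining 66 bytes are the partition table and the boot signature).

```shell
# Dry-run sketch: prints the command that would image one partition.
# Names and file layout are assumptions, not those of generate-image.sh.
image_command() {
    local fstype="$1" part="$2" out="$3"
    case "$fstype" in
        ntfs) echo "ntfsclone --save-image --output ${out}.ntfsclone ${part}" ;;
        vfat) echo "dd if=${part} of=${out}.dd bs=1M" ;;
        *)    echo "unsupported filesystem: ${fstype}" >&2; return 1 ;;
    esac
}

# Prints the two MBR copies: the full sector, and the boot code only
# (that is, the MBR without its partition table).
mbr_commands() {
    local disk="$1" dir="$2"
    echo "dd if=${disk} of=${dir}/mbr-full.bin bs=512 count=1"
    echo "dd if=${disk} of=${dir}/mbr-boot-code.bin bs=446 count=1"
}
```

Printing instead of executing keeps the sketch harmless; piping the output to a shell as root would be the dangerous part.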

Nevertheless, I have decided to publish it here for the nerdier guys.

So, then you upgrade to 10, wait until it’s activated and that’s the last you’ll see from Windows 10.

Now, you boot back to your LiveUSB, go to the image directory the script created and run sudo ./restore.sh and it will restore everything magically. If you want to use the backed up partition table, give any parameter (it’s a bit dumb, yes…).

When it’s all done… Reboot. You’re back to your Windows 7 machine as if nothing ever happened.

Now for the part any Windows user can do. The two scripts in privacy.zip are privacy.cmd and privacy.reg. The reg file you can just double-click, and it will essentially mark your machine as being “not interested in Windows 10, don’t bother me any more”. It disables GWX (the Windows 10 notification icon), disables the upgrade function, disables reservation and stops recommended updates from being treated like important updates. This is important, because Microsoft used the “recommended” channels to push these -let’s just say “annoying”- patches to your computer.

The privacy.cmd script does something entirely different. If you haven’t been living under a rock the last months, you know that Microsoft pushed patches that add a tracking service to your pristine Windows 7 installation. Now the script starts off by stopping that service, and then disabling it. I do this because the uninstallation of the offending patches might fail for some reason. At least, then you’re sure the service is off. After it has done this, the script tries to uninstall the patches related to the Windows 10 upgrade and the tracking service.
Be advised, in order for the privacy.cmd script to work you need to run it as Administrator. Right click on it, then select “Run as Administrator”. It might take a while.
Congratulations, the nagging for the upgrade should stop, until Microsoft decides to push it as an important upgrade. After a reboot, you may want to manually mark these patches as hidden. Perhaps I should try to figure out whether you can do that with a registry patch too.

Upgrade to Windows 10 or not?

Pit Wenkin asked me regarding my thoughts about upgrading to Windows 10 or not.  It ended up being a rather large post, so I decided to write it down as a blog post:

What do I recommend?  You’re asking this of a Linux user.

For starters:
– If you are a Windows 8 user, do upgrade… Now… It is better than Windows 8.
– If you are a Windows 7 user, you are between a rock and a hard place.  Windows 10 is not better than 7, at least not in my eyes.  Windows 7 is end of life in January 2020 (Source: microsoft.com), which means security patches should come in until then.  However, your “Free” upgrade is only valid for one year.  You have to upgrade NOW, or you are losing money.
– The reviews of 10 are generally positive, but… the arguments are always the same: it’s a Windows 8 underpinning (which, allegedly has a bit more “under the hood improvements”) with a more 7 like interface.  It’s still the ugly flat interface, though.  It always stops with “Hey, it’s free, you should take it”.  I personally find that one of the worst arguments for an upgrade.

Knowing this, you have to balance out the following:

  1. Will Microsoft keep their promise regarding EOL status of 7?  If we look back in history, we know they won’t.  Both NT 4.0 and Windows 2000 didn’t get important security updates before their EOL because “it was too much work for the short time”.  The answer Microsoft could give is: Hey, Win10 is free, upgrade to that.  It would be an arsehole move, I admit, but look deep into your heart:  How much do you actually trust Microsoft?
  2. How long are you going to keep your device?  If you’ve got a machine and think you’re going to replace it anyway before 01/2020, you have no reason to upgrade (ignoring point 1).  Just keep on shrugging happily with Windows 7, and your new machine will be 10 anyway (or a Mac, please buy a Mac or ask me to install Linux!)
  3. Given point 2.  Keep in mind that machines have longer lifespans these days.  Even if you get a new machine every three years or so, it’s most likely going to have a life after your usage.  Which means it had better have Windows 10.  It increases its “value” in the sense that it will get continued patches once it’s in someone else’s hands.  Now, you might not care and that’s fine.  I am just pointing it out.
  4. How much time do you have to spare?  It’s quite simple.  If you do the upgrade now, and then immediately roll back (Yes! You can do that!), your machine is registered as being upgraded.  The main issue here is that we do not know how much the hash Microsoft keeps about your machine will change across various hardware upgrades.  Does a disk change modify the hash?  Does a RAM upgrade?  We only know for certain that a motherboard swap does.

This brings us to my plan for my family and friends’ machines, and the one I carried out on my Ultrabook¹.  I will take their machines, one by one, upgrade each to 10, then revert back to 7.  That way, in 2020, they can go to 10 (because they have to), and keep on using 7 meanwhile.  Should anyone care to go to 10 voluntarily, they will be able to without paying.  At least, that’s the theory.  This will waste a lot of my time and a shitload of bandwidth, but it’s the best balance I found between points 1-4.
I am going to test what happens if I do a disk swap, instead of a dd clone (that takes so long).  If I can get a machine to upgrade with HDD A, and then use another HDD B to do an install from scratch and it activates fine, I don’t need to do the upgrade on the actual installation (aka, the one people use) and it’s only downtime for the users.

1 My Ultrabook came with Windows 8.  It never actually booted into 8, because I dumped Linux on it.  From day one.  Now, since I do care about the people “after me”, I did the following:  I made a dd clone of the disk, then I installed Windows 8, then I upgraded to Windows 10, then I restored the dd clone of the disk.  It took over three days (in the sense that I did one operation every evening and let it work overnight).  This is the roadmap I have for Windows 7 machines.  Secure the upgrade, continue using the old and trusted.

Windows 10 upgrades – I’m becoming highly sceptical

If you’ve been following my progress on Facebook, you know I am getting very sceptical regarding the Windows 10 upgrade process.  The word on the street is that, if you have a legit installation, and do the upgrade from your Windows 7 installation, your key -printed on the famous sticker- is going to be “upgraded” to a 10 key.  (Ignoring Windows 8 for now, as the keys are in firmware)

Now, fate happened to give me a defective computer just before Windows 10 got released.  My sister’s computer’s hard drive died and it required a full reinstall.  My sister has a System Builder version of Windows 7 Pro.  It is 100% legit, has never been installed on any other hardware and has basically only been installed once, a few years ago, when she bought the hardware.  Ideal situation.
Since I finished the 7 install, but didn’t have the time to go on with the installation, I decided to let it upgrade and, as such, make sure her key is both valid for 7 and 10.  Regardless of what you think about 10, we all know that a fresh start (complete reinstall) is always preferable.  So, I decided to download the Windows 10 USB stick creation tool, and create a bootable Windows 10 USB installer. (On her computer, from the upgraded 10 version, no less!)  The word on the street is that, after a successful 10 upgrade, you could install from scratch.

So, I launch the installer and it asks me for the key…  The key that -according to the word on the street- should have been upgraded during the, ehm, upgrade.  Not so… It didn’t take it.  I find this highly worrying.  If these keys are not updated, future reinstalls will not work and sooner or later the “Install 7/8, then upgrade” route will no longer be free.
I now tried “Skip” and reinstalled it from scratch anyway.  Perhaps network connectivity is missing or so, and that’s why it doesn’t work.  If not, I foresee huge problems in the future when re-installations of 10 are needed on initially upgraded machines.

If the “install first, then enter key and activate” scenario fails, I give up on Windows 10 for my family and they’ll have to live with 7.  Which, to be entirely honest, is still superior.

Update 2015-08-1@23:31CEST

It makes sense now.  What really happens is that you seem to get a new key.  It is not even a special key, everyone gets the same one.  What really seems to happen is that a hardware hash is sent to Microsoft to identify the machine associated with the OEM key (I have no retail keys to test).
So, every time the installer asked for a key, I skipped it, ending up on a desktop which was… activated!  So, yes, you can reinstall your machine freshly after you did an upgrade, it just is really, really, really dumb about it.  The user (me, in this case) is left with the idea he has a bad key, but the key no longer matters.  At least, not the key you used for the upgrade.
Now, keep in mind this has a bitter after-taste.  Re-using OEM licenses, as was totally legal in the EU, suddenly became much harder, if not impossible.  Also, if you decide to stay with 7, and upgrade your hardware in the next few years, and in 2020, you say… “Hey, I had this 10 license, I can do that upgrade for free, still”, your hash might have changed and you’ll be out of luck too.  Pray for static hardware if that’s the path you choose to go.

A little declaration of love

My beautiful wife, Nathalie

I know I’m not as attentive as I should be.

I know that, I often say the wrong things at the wrong moment, for the wrong reasons.

I know that, at a certain point in our lives, I failed to be there for you when you needed me most.

I know, it hasn’t always been easy, but we still seemed to find a way.

I know I make jokes about the woes of the married man, and play the repressed husband all the time.  It’s supposed to be a running joke, but you always take it so  seriously.

I also know, that despite the fact that we couldn’t be more different in pretty much every aspect, I wouldn’t want to miss you in my life.  So, just in case you forgot it: I love you.  Perhaps even more today, than ten years ago, when I became your husband at the Mairie in Mamer.

Happy tenth anniversary, my beautiful wife, Nathalie!

Used up the gandi.net coupon codes

Since nobody wanted my coupon codes, I ended up getting the following for a whopping 1.17€ (including VAT)

  • dynamic-ip.xyz
  • home-ip.xyz
  • home-server.xyz
  • luxtrust.me
  • whiskeytangofoxtrot.eu

The first three are obviously for my Dynamic DNS service.  Want a subdomain on one of these (or ipv4.lu)?  Give me a sign.

The whiskeytangofoxtrot.eu one was just because I tried so many .eu and nothing interesting was free, so I typed that and it was available.  No idea what I’m going to do with that.  Dynamic DNS service is also ok, I think.

Now, luxtrust.me came on a whim.  These .me domains work best with a verb, like “help”, “instruct”, etc… Obviously, “trust.me” came up in my mind and even more obviously, it was taken.  But, luxtrust.me was free.  Too good to let it slide.
I might set up email forwarding for it, so I can give LuxTrust an email like “angrycustomer@luxtrust.me”.  I’ve got 1000 free forwards, so I might even give them away to anyone who wants. *grin*

The attentive reader will notice that the .com at 50% off is not listed.  That’s the only one I managed to give away.

Free .xyz domain

gandi.net logo, downloaded from the Gandi Image Library

As I mentioned in my .lu 3-letter domain analysis, I got a few codes at Gandi to get certain TLDs for free or at a discount for one year. In all fairness, I have not much use for them and the cool ones seem to be long gone. After the free first year, I’ll have to pay for them anyway. I have 9 domain names (ignoring those where I’m marked “technical contact” only and ignoring willekens.lu, owned by my father), which total 202.50€ per year. For a private person and just for a hobby, that’s do-able, but not exactly cheap. Okay, hobbies cost money. C’est la vie.

I also offer Dynamic DNS services to friends and family. I have two domains dedicated to this, namely ipv4.lu and homesupport.lu. The former is for nerdy friends all over the world who want their home connection to be accessible. The latter I mainly use for family, so I can support them using RDP over SSH or VLC over SSH. You can’t ask me to get one under that domain, I decide that for you.

Basically, you could have something like marvin.ipv4.lu pointing to your home connection, and all through the magic of some scripts and two VPSes I rent (one in the US, one in the EU, for redundancy). I can, however, understand that the ipv4.lu doesn’t appeal to many non-Luxembourgish nerds.

So, if there is any interest, I could get a few extra domains for the Dynamic DNS service. The codes I have at Gandi are:

  • 3× .xyz for free
  • 1× .me for free
  • 1× .eu for 1€
  • 1× 50% off for a .com

I checked, stuff like “home-ip.xyz”, “home-connection.xyz” and “home-server.xyz” are still available. Personally, I have trouble finding anything with .me that makes sense and is available and the rest are for-pay. Now, of course, if you have a great idea for either of the ones listed above, you can tell me.

Now, if you think my Dynamic DNS service is good as-is and really, really, really need the coupon for your own usage, just ask nicely. I might even be in a good mood and just give it to you. The coupons expire on the 1st of May 2015.

Three letter ccTLD domains

The Ring of ccTLDs #3 by Grey Hargreaves.
Creative commons license, found on Flickr.

My registrar of choice, Gandi, had its 15th anniversary this month. Apparently, I’ve been a customer for 15 years too. Has it been that long? Anyway, they gave away prizes and I’ve got codes for three free .xyz, one free .me, a .com at 50% and a .eu at 1€. To be entirely frank, I have no idea what to do with any of those codes¹, but as you do when you get something for free, you tend to look what’s up for grabs. As the shortest, non-grandfathered, domain names you seem to be able to get are three letters long, I tried a few for .xyz and to my surprise I saw that the corresponding .lu was free.

That was a surprise. I’d have expected that most, if not all, three letter .lu domains would be taken. So I decided to investigate. A quick one-liner pounded the whois servers, and, well, I got banned quite quickly at my work IP address. I should have foreseen that. You might have seen a Facebook status about it, and someone suggested first checking whether there are DNS records² and then, and only then, doing the whois checks³. I decided to do exactly that and I ended up with 14291 three letter domains that have no valid DNS entries. That leaves an amazingly small number that do. There are 26×26×26 = 17576 possibilities⁴, which means only 19% of all three letter .lu domain names have DNS entries.
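For those who want to check the percentage, the arithmetic can be redone in the shell. The two figures come straight from the paragraph above; only the variable names are invented for this sketch.

```shell
# Figures from the text; variable names are made up for this sketch.
total=$((26 * 26 * 26))            # all three-letter labels, letters only
without_dns=14291                  # labels for which no DNS entry was found
with_dns=$((total - without_dns))  # labels that do have DNS entries

# Integer percentage, rounded to the nearest whole number.
percent=$(( (with_dns * 100 + total / 2) / total ))

echo "${with_dns} of ${total} names have DNS entries (about ${percent}%)"
```

This prints 3285 of 17576 names, or about 19%.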

Now, what? That’s way too many for bulk querying the whois servers and I had no desire to get my home IP blacklisted. My plan was to do one whois every 20 minutes, but that would take nearly 200 days. I decided to go manually over the list and pick the ones that caught my eye. I’m human, I get bored, so that’s probably why I selected more at the beginning of the alphabet. Anyway, I selected 87 domains for investigation and it turned out that 71 of those were not registered. Some examples (but really, just a few):

  • ado.lu : “ado” is French for teenager.
  • aes.lu : Advanced Encryption Standard. Neat to have as nerd.
  • asm.lu : Nobody in the demo scene got this? Seriously?
  • foo.lu, bar.lu, and baz.lu : Yes, you can still have the full metasyntactic-variable sequence. That “bar.lu” isn’t taken is simply amazing.
  • bbw.lu : I am so tempted to get this one.
  • bid.lu : For an auction site?
  • fac.lu : In French “la fac” is pretty much the colloquial equivalent of university.
  • fkk.lu : The Germans will understand.
  • gnu.lu : All hail Richard Stallman!
  • jiz.lu : If you don’t know why, you need to have your perversion levels adjusted.
  • jts.lu : Ok, this one only means something to me. Online I get referred to as JTS. I don’t know when people started to do that, but I guess it’s because “jawtheshark” is too long.
  • nan.lu : Not a number. Another nerdy one.
  • pdp.lu : Neeeeerd! You should also take vms.lu, which is also available.
  • pie.lu : The cake is a lie, but the pie isn’t.
  • ocr.lu : Optical character recognition. I could see value in this if you’re in document management.
  • raw.lu : Calling the photography nerds… or for weird porn.
  • tit.lu : Again, I’m so tempted to take this one.
  • xen.lu : I should get this one, just for when I need to go freelance and want to offer virtualization services.
  • zzz.lu : Because I really got sleepy after going through so many domain names.

You can get the full list of the ones I verified as “not registered”. (List without DNS entries) A .lu is open for registration to everyone, worldwide, and costs about 25€ per year.


Addendum
Apparently, while creating this post, I opened up the wrong list, namely the DNS verified one. My mistake. A few listed here are not free and haven’t been for a while. Those are foo.lu and bar.lu. No metasyntactic-variables for you. Sorry.


1 I could add a few to my “free-for-friends” dynamic DNS. For now you can only get a subdomain of ipv4.lu.
2 Script used: for domain in {a..z}{a..z}{a..z}; do if host "$domain.lu" | grep -q NXDOMAIN; then echo "$domain.lu"; fi; done > threeletters.txt
3 Script used: for domain in $(cat selected-domains.txt); do if whois "$domain" | grep -q "% No such domain"; then echo "$domain is free"; fi; sleep 1200; done > available-threeletter.txt
4 Ignoring numbers, which would expand the search space a bit more.
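The candidate list from footnote 2 and the “nearly 200 days” estimate can be reproduced without touching the network. This is only a sketch: the function names are made up, the nested loops are a spelled-out equivalent of the brace expansion, and the 20 minutes match the sleep 1200 in footnote 3.

```shell
# Prints every three-letter label followed by the given TLD, one per line.
# Hypothetical helper; equivalent to {a..z}{a..z}{a..z} in bash.
three_letter_names() {
    local tld="$1" letters a b c
    letters="a b c d e f g h i j k l m n o p q r s t u v w x y z"
    for a in $letters; do
        for b in $letters; do
            for c in $letters; do
                printf '%s%s%s.%s\n' "$a" "$b" "$c" "$tld"
            done
        done
    done
}

# Whole days needed to query COUNT domains at one query every MINUTES minutes.
whois_days() {
    local count="$1" minutes="$2"
    echo $(( count * minutes / 1440 ))   # 1440 minutes per day
}
```

three_letter_names lu yields the 17576 candidates, and whois_days 14291 20 gives 198 full days, which is where the “nearly 200 days” comes from.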

Five years…

A filled Bofferding glass by Thomas Heijting
Creative commons license, found on WikiMedia

I’m a bit late, but it’s still the 1st of March 2015 somewhere on this world at the time of this writing, so I’ll count it as “posted on the right day”. It’s a day I remember every year. Five years ago, I had my last hangover. I can of course never guarantee that there won’t ever be one again, but for now, I haven’t had a drink in 1826 days. If you write that number down, you realize that’s not that much.

It’s also nothing much to be proud of, because frankly, I’d trade in a heartbeat with all of you people, who can have two drinks and say “it’s enough for tonight” and manage to stick to it. I also don’t really think congratulations are in order, as one misstep, and I’m back there on the floor stark-drunk as I was the 28th February 2010. It’s not a nice thought, but I have to keep it in mind every day. Not that there is much craving, that has gone long ago.

What also is not the right thing to say is “I couldn’t do that”. It’s fake admiration. Either you can, but don’t want to (and thus you’re lying), or you cannot for real and then you are like me. Either way, it’s one of the things I don’t want to hear.

So, all of you who can drink, raise a glass on me and get a buzz. It’s all I ask.

Update on the Debian kernel bug.

It might not be caused by the kernel, but by the Xen hypervisor.  What I did up to now:

  • I installed the problematic kernel in a virtual machine (DomU) while the host (Dom0) was running Jessie and thus a different kernel.  Within that environment, no problem occurred.
  • I reinstalled Wheezy on the machine, but this time, I did not install Xen and did exactly the same dd command.  The problem did not arise.  (I also simplified the disk setup and upgraded my BIOS to the latest version, for good measure.  It shouldn’t make much of a difference, I was surprised there was a new BIOS in the first place)
  • Confident that the disk setup might have been the cause (originally I had a 4-disk raid6 with one spare, and now I simply have a 2-disk raid1 with no spares), I installed Xen and rebooted.  When I tried the dd command, I got my Oops.

Conclusion:  the error only seems to occur on a Dom0 while using Xen.  It can be avoided by upgrading to Jessie.
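Since the problem only shows up in a Dom0, it helps to be sure where a given shell is actually running. A minimal sketch, assuming xenfs is mounted at /proc/xen (the usual setup, but an assumption here) and that the capabilities file contains control_d in the control domain; the function name is made up.

```shell
# Succeeds when the capabilities file marks this kernel as the Xen
# control domain (Dom0). The optional argument exists only for testing.
is_dom0() {
    local caps="${1:-/proc/xen/capabilities}"
    [ -r "$caps" ] && grep -q 'control_d' "$caps"
}
```

On a DomU the file exists but is empty, and without Xen it is missing altogether, so is_dom0 fails in both cases.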

While that is good news for the new setup, it still implies that under no circumstances, I can reboot hammerhead before mako is ready.  It might of course be linked to AMD specific code, but I’m really not willing to take that risk.

I wonder if I should file a bug with the Debian kernel team.

Recently, I decided to reinstall my old self-built rack server (AMD A6-3650, 16GB RAM, Asus F1A75-V PRO).  It wasn’t really being used and since I want to reconfigure my Dell R210-II, I decided the AMD should, at least temporarily, take over the Dell’s tasks.  Yes, I know it’s not real server hardware, and yes, I think of buying another R2xx when I’ve got a bit of money to waste, which is not now.

So, I installed Debian Wheezy and the Xen Hypervisor on it, as always using PXE, which means you end up with an installation that is fully up-to-date, unlike my other machines, which tend to have older kernels because I rarely see a reason to reboot.

Then, one of the first things I tried was to clone a disk over the network ( dd if=/dev/vg0/vm-root | ssh root@mako "dd of=/dev/vg0/vm-root" ).  I have done these things before, and I know they work.  It’s not the quickest way, but I had my reasons to do it this way.  Thing is: I got a kernel oops.  While only an “oops”, it does make the system unstable so a reboot is truly recommended.

I thought it would perhaps be a fluke, so I tried again… Same thing, so I removed the networking component and tried a simple dd if=/dev/zero of=/dev/vg0/big-lv bs=1073741824 and, yes again a kernel oops.  It looks something like this:

Feb 11 23:00:08 mako kernel: [ 8450.177200] BUG: unable to handle kernel paging request at ffff88013f800000
Feb 11 23:00:08 mako kernel: [ 8450.177222] IP: [<ffffffff811b3e27>] clear_page_c+0x7/0x10
Feb 11 23:00:08 mako kernel: [ 8450.177237] PGD 1606067 PUD bdd89067 PMD bdf86067 PTE 0
Feb 11 23:00:08 mako kernel: [ 8450.177256] Oops: 0002 [#1] SMP 
Feb 11 23:00:08 mako kernel: [ 8450.177268] CPU 2 
Feb 11 23:00:08 mako kernel: [ 8450.177272] Modules linked in: fuse btrfs crc32c libcrc32c zlib_deflate ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs reiserfs ext3 jbd ext2 efivars xen_gntdev xen_evtchn xenfs nfsd nfs nfs_acl auth_rpcgss fscache lockd sunrpc bridge stp loop radeon ttm drm_kms_helper eeepc_wmi snd_hda_codec_hdmi psmouse asus_wmi sparse_keymap snd_hda_intel snd_hda_codec rfkill drm snd_hwdep snd_pcm powernow_k8 mperf pl2303 snd_page_alloc power_supply serio_raw pcspkr i2c_piix4 evdev snd_timer k10temp wmi snd usbserial soundcore button processor thermal_sys ext4 crc16 jbd2 mbcache dm_mod raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq md_mod usbhid hid sg sd_mod crc_t10dif r8169 mii ohci_hcd ahci libahci xhci_hcd ehci_hcd libata igb i2c_algo_bit i2c_core dca scsi_mod usbcore usb_common [last unloaded: scsi_wait_scan]
Feb 11 23:00:08 mako kernel: [ 8450.177604] 
Feb 11 23:00:08 mako kernel: [ 8450.177610] Pid: 4221, comm: sshd Not tainted 3.2.0-4-amd64 #1 Debian 3.2.65-1+deb7u1 System manufacturer System Product Name/F1A75-V PRO
Feb 11 23:00:08 mako kernel: [ 8450.177630] RIP: e030:[<ffffffff811b3e27>]  [<ffffffff811b3e27>] clear_page_c+0x7/0x10
Feb 11 23:00:08 mako kernel: [ 8450.177647] RSP: e02b:ffff8801f1f17b30  EFLAGS: 00010246
Feb 11 23:00:08 mako kernel: [ 8450.177658] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000200
Feb 11 23:00:08 mako kernel: [ 8450.177669] RDX: ffffea00045e4000 RSI: 0000000000000000 RDI: ffff88013f800000
Feb 11 23:00:08 mako kernel: [ 8450.177681] RBP: ffffea00045e4000 R08: 0000000000000000 R09: 00000000000401d7
Feb 11 23:00:08 mako kernel: [ 8450.177693] R10: 0000000000000002 R11: 0000000000000fc4 R12: 0000000000000000
Feb 11 23:00:08 mako kernel: [ 8450.177704] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8801f1f16000
Feb 11 23:00:08 mako kernel: [ 8450.177718] FS:  00007f7904fce7c0(0000) GS:ffff8803cb500000(0000) knlGS:0000000000000000
Feb 11 23:00:08 mako kernel: [ 8450.177734] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Feb 11 23:00:08 mako kernel: [ 8450.177745] CR2: ffff88013f800000 CR3: 000000014239e000 CR4: 0000000000000660
Feb 11 23:00:08 mako kernel: [ 8450.177757] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Feb 11 23:00:08 mako kernel: [ 8450.177769] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Feb 11 23:00:08 mako kernel: [ 8450.177781] Process sshd (pid: 4221, threadinfo ffff8801f1f16000, task ffff8803b356e100)
Feb 11 23:00:08 mako kernel: [ 8450.177796] Stack:
Feb 11 23:00:08 mako kernel: [ 8450.177804]  ffffffff810bb8cd ffff8803cb515628 ffffea00045e4000 0000000000000000
Feb 11 23:00:08 mako kernel: [ 8450.177830]  00000001000280da ffffffff00000041 00000003caf73025 ffff8803cb72ac08
Feb 11 23:00:08 mako kernel: [ 8450.177856]  ffff8803cb72ac00 0000000081004f2f 0000000000000030 ffff8803cb72ac08
Feb 11 23:00:08 mako kernel: [ 8450.177882] Call Trace:
Feb 11 23:00:08 mako kernel: [ 8450.177894]  [<ffffffff810bb8cd>] ? get_page_from_freelist+0x57a/0x665
Feb 11 23:00:08 mako kernel: [ 8450.177907]  [<ffffffff810bbb3e>] ? __alloc_pages_nodemask+0x186/0x7ab
Feb 11 23:00:08 mako kernel: [ 8450.177921]  [<ffffffff810d1a97>] ? handle_pte_fault+0x298/0x79f
Feb 11 23:00:08 mako kernel: [ 8450.177933]  [<ffffffff81004e44>] ? pte_pfn_to_mfn+0x26/0x77
Feb 11 23:00:08 mako kernel: [ 8450.177945]  [<ffffffff8100569f>] ? __xen_set_pte+0x11/0x51
Feb 11 23:00:08 mako kernel: [ 8450.177957]  [<ffffffff810e6ee9>] ? alloc_pages_vma+0x12d/0x136
Feb 11 23:00:08 mako kernel: [ 8450.177969]  [<ffffffff810d1964>] ? handle_pte_fault+0x165/0x79f
Feb 11 23:00:08 mako kernel: [ 8450.177981]  [<ffffffff810cefaf>] ? pmd_val+0x7/0x8
Feb 11 23:00:08 mako kernel: [ 8450.177992]  [<ffffffff810cf02d>] ? pte_offset_kernel+0x16/0x35
Feb 11 23:00:08 mako kernel: [ 8450.178005]  [<ffffffff81353e74>] ? do_page_fault+0x320/0x345
Feb 11 23:00:08 mako kernel: [ 8450.178018]  [<ffffffff81095461>] ? arch_local_irq_save+0x11/0x15
Feb 11 23:00:08 mako kernel: [ 8450.178029]  [<ffffffff81095e17>] ? __call_rcu+0x21/0x12c
Feb 11 23:00:08 mako kernel: [ 8450.178041]  [<ffffffff8110b26f>] ? dput+0x27/0xee
Feb 11 23:00:08 mako kernel: [ 8450.178052]  [<ffffffff810fc21e>] ? fput+0x17a/0x1a1
Feb 11 23:00:08 mako kernel: [ 8450.178063]  [<ffffffff810eb3fb>] ? arch_local_irq_restore+0x7/0x8
Feb 11 23:00:08 mako kernel: [ 8450.178074]  [<ffffffff81351415>] ? page_fault+0x25/0x30
Feb 11 23:00:08 mako kernel: [ 8450.178084] Code: 20 4c 89 4c 24 48 c7 44 24 08 10 00 00 00 48 89 44 24 18 e8 8c f9 ff ff 48 83 c4 58 c3 90 90 90 90 90 90 90 b9 00 02 00 00 31 c0 <f3> 48 ab c3 0f 1f 44 00 00 b9 00 10 00 00 31 c0 f3 aa c3 66 0f 
Feb 11 23:00:08 mako kernel: [ 8450.178270] RIP  [<ffffffff811b3e27>] clear_page_c+0x7/0x10
Feb 11 23:00:08 mako kernel: [ 8450.178283]  RSP <ffff8801f1f17b30>
Feb 11 23:00:08 mako kernel: [ 8450.178291] CR2: ffff88013f800000
Feb 11 23:00:08 mako kernel: [ 8450.178436] ---[ end trace c0e1c75d9283be10 ]---
Feb 11 23:00:08 mako kernel: [ 8450.178466] note: sshd[4221] exited with preempt_count 1
Feb 11 23:00:08 mako kernel: [ 8450.178971] BUG: scheduling while atomic: sshd/4221/0x10000001
Feb 11 23:00:08 mako kernel: [ 8450.179002] Modules linked in: fuse btrfs crc32c libcrc32c zlib_deflate ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs reiserfs ext3 jbd ext2 efivars xen_gntdev xen_evtchn xenfs nfsd nfs nfs_acl auth_rpcgss fscache lockd sunrpc bridge stp loop radeon ttm drm_kms_helper eeepc_wmi snd_hda_codec_hdmi psmouse asus_wmi sparse_keymap snd_hda_intel snd_hda_codec rfkill drm snd_hwdep snd_pcm powernow_k8 mperf pl2303 snd_page_alloc power_supply serio_raw pcspkr i2c_piix4 evdev snd_timer k10temp wmi snd usbserial soundcore button processor thermal_sys ext4 crc16 jbd2 mbcache dm_mod raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq md_mod usbhid hid sg sd_mod crc_t10dif r8169 mii ohci_hcd ahci libahci xhci_hcd ehci_hcd libata igb i2c_algo_bit i2c_core dca scsi_mod usbcore usb_common [last unloaded: scsi_wait_scan]
Feb 11 23:00:08 mako kernel: [ 8450.181496] Pid: 4221, comm: sshd Tainted: G      D      3.2.0-4-amd64 #1 Debian 3.2.65-1+deb7u1
Feb 11 23:00:08 mako kernel: [ 8450.181530] Call Trace:
Feb 11 23:00:08 mako kernel: [ 8450.181560]  [<ffffffff8134a2be>] ? __schedule_bug+0x3e/0x52
Feb 11 23:00:08 mako kernel: [ 8450.181591]  [<ffffffff8134f4a5>] ? __schedule+0x85/0x610
Feb 11 23:00:08 mako kernel: [ 8450.181621]  [<ffffffff8110b26f>] ? dput+0x27/0xee
Feb 11 23:00:08 mako kernel: [ 8450.181652]  [<ffffffff81042090>] ? __cond_resched+0x1d/0x26
Feb 11 23:00:08 mako kernel: [ 8450.181682]  [<ffffffff8134fa7f>] ? _cond_resched+0x12/0x1c
Feb 11 23:00:08 mako kernel: [ 8450.181713]  [<ffffffff81049a2a>] ? put_files_struct+0x65/0xad
Feb 11 23:00:08 mako kernel: [ 8450.181743]  [<ffffffff8104a02c>] ? do_exit+0x292/0x713
Feb 11 23:00:08 mako kernel: [ 8450.181774]  [<ffffffff8107130f>] ? arch_local_irq_disable+0x7/0x8
Feb 11 23:00:08 mako kernel: [ 8450.184239]  [<ffffffff81071307>] ? arch_local_irq_restore+0x7/0x8
Feb 11 23:00:08 mako kernel: [ 8450.184271]  [<ffffffff81350e3f>] ? _raw_spin_unlock_irqrestore+0xe/0xf
Feb 11 23:00:08 mako kernel: [ 8450.184304]  [<ffffffff81048345>] ? kmsg_dump+0x52/0xdd
Feb 11 23:00:08 mako kernel: [ 8450.184336]  [<ffffffff81350e3f>] ? _raw_spin_unlock_irqrestore+0xe/0xf
Feb 11 23:00:08 mako kernel: [ 8450.184368]  [<ffffffff81351d14>] ? oops_end+0xb1/0xb6
Feb 11 23:00:08 mako kernel: [ 8450.184399]  [<ffffffff81349d8b>] ? no_context+0x1ff/0x20e
Feb 11 23:00:08 mako kernel: [ 8450.184430]  [<ffffffff81349619>] ? pmd_val+0x7/0x8
Feb 11 23:00:08 mako kernel: [ 8450.184460]  [<ffffffff81349638>] ? pte_offset_kernel+0x16/0x35
Feb 11 23:00:08 mako kernel: [ 8450.184491]  [<ffffffff81353d0a>] ? do_page_fault+0x1b6/0x345
Feb 11 23:00:08 mako kernel: [ 8450.184522]  [<ffffffff81004e44>] ? pte_pfn_to_mfn+0x26/0x77
Feb 11 23:00:08 mako kernel: [ 8450.184553]  [<ffffffff81004375>] ? __raw_callee_save_xen_make_pte+0x11/0x1e
Feb 11 23:00:08 mako kernel: [ 8450.184584]  [<ffffffff81351415>] ? page_fault+0x25/0x30
Feb 11 23:00:08 mako kernel: [ 8450.184615]  [<ffffffff811b3e27>] ? clear_page_c+0x7/0x10
Feb 11 23:00:08 mako kernel: [ 8450.184646]  [<ffffffff810bb8cd>] ? get_page_from_freelist+0x57a/0x665
Feb 11 23:00:08 mako kernel: [ 8450.184677]  [<ffffffff810bbb3e>] ? __alloc_pages_nodemask+0x186/0x7ab
Feb 11 23:00:08 mako kernel: [ 8450.184709]  [<ffffffff810d1a97>] ? handle_pte_fault+0x298/0x79f
Feb 11 23:00:08 mako kernel: [ 8450.184739]  [<ffffffff81004e44>] ? pte_pfn_to_mfn+0x26/0x77
Feb 11 23:00:08 mako kernel: [ 8450.184770]  [<ffffffff8100569f>] ? __xen_set_pte+0x11/0x51
Feb 11 23:00:08 mako kernel: [ 8450.184800]  [<ffffffff810e6ee9>] ? alloc_pages_vma+0x12d/0x136
Feb 11 23:00:08 mako kernel: [ 8450.184831]  [<ffffffff810d1964>] ? handle_pte_fault+0x165/0x79f
Feb 11 23:00:08 mako kernel: [ 8450.184862]  [<ffffffff810cefaf>] ? pmd_val+0x7/0x8
Feb 11 23:00:08 mako kernel: [ 8450.184892]  [<ffffffff810cf02d>] ? pte_offset_kernel+0x16/0x35
Feb 11 23:00:08 mako kernel: [ 8450.184922]  [<ffffffff81353e74>] ? do_page_fault+0x320/0x345
Feb 11 23:00:08 mako kernel: [ 8450.184954]  [<ffffffff81095461>] ? arch_local_irq_save+0x11/0x15
Feb 11 23:00:08 mako kernel: [ 8450.184984]  [<ffffffff81095e17>] ? __call_rcu+0x21/0x12c
Feb 11 23:00:08 mako kernel: [ 8450.185014]  [<ffffffff8110b26f>] ? dput+0x27/0xee
Feb 11 23:00:08 mako kernel: [ 8450.185044]  [<ffffffff810fc21e>] ? fput+0x17a/0x1a1
Feb 11 23:00:08 mako kernel: [ 8450.185074]  [<ffffffff810eb3fb>] ? arch_local_irq_restore+0x7/0x8
Feb 11 23:00:08 mako kernel: [ 8450.185105]  [<ffffffff81351415>] ? page_fault+0x25/0x30

Okay, a Debian stable kernel causing kernel oopses?  Nah, can’t be…  Damn, the memory is probably broken.  So I scheduled an overnight memtest86+ run, and in the morning it told me everything was just fine.

At this point, I worried that it truly was a kernel bug.  I checked my other machines: none of them is running 3.2.65-1+deb7u1 yet, but all of them already have it installed.  Unlike Ubuntu, Debian doesn’t seem to amass old kernels in /boot.  I would have tried an older kernel, but somehow I didn’t find the magic apt invocation to install one.
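
For what it's worth, checking whether a machine runs the kernel it has installed can be scripted.  What follows is a minimal sketch, not what I actually typed back then.  It assumes a Debian system where `uname -v` contains the word "Debian" followed by the package version, and where the kernel package is called linux-image-$(uname -r).  The function name running_pkg_version is made up for this example.

```shell
#!/bin/sh
# Sketch: compare the running kernel's package version with the installed one.
# Assumption: `uname -v` looks like "#1 SMP Debian 3.2.65-1+deb7u1".

# Print the field that follows the word "Debian" in a `uname -v` style string.
# Prints nothing if the word "Debian" is not found.
running_pkg_version() {
    printf '%s\n' "$1" | awk '{
        for (i = 1; i < NF; i++)
            if ($i == "Debian") { print $(i + 1); exit }
    }'
}

# Only attempt the comparison when dpkg-query is available.
if command -v dpkg-query >/dev/null 2>&1; then
    running=$(running_pkg_version "$(uname -v)")
    # Version of the installed package that matches the running kernel ABI.
    installed=$(dpkg-query -W -f='${Version}' "linux-image-$(uname -r)" 2>/dev/null)

    if [ -z "$running" ] || [ -z "$installed" ]; then
        echo "Could not determine both versions (not a stock Debian kernel?)"
    elif [ "$running" = "$installed" ]; then
        echo "Running kernel matches the installed package: $running"
    else
        echo "Running $running, but $installed is installed: a reboot will switch kernels"
    fi
fi
```

Run it on each machine (over ssh, for example).  If the two versions differ, the machine picks up the newer kernel on its next reboot.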

I still wanted to verify whether it’s the kernel causing this, so I upgraded the AMD machine to Jessie.  After I did so, I ran the same tests as on the original install and everything worked exactly as expected.  No more oops.

I also realize that my other machines are a reboot away from instability!  Scary thought.  Now, I’ll probably just trash the system, try wheezy again and see whether the problem comes back.  If so, it must be a kernel bug.  The question is whether I should report it to the Debian kernel team.  I’m not sure I can really help them; I could also just ignore it and go Jessie (getting rid of systemd isn’t all that hard on a server, as I found out today).