May 13

Linux is blessed with a variety of methods for accessing, expanding and building dynamic, scalable storage solutions. And they’re not just for industry experts either – even everyday Linux users can get access to a great deal of functionality by sharing and accessing data across a network.

Set up iSCSI and your data will stream from your PC like a magical rainbow. Perhaps. If you’re very lucky.

But there’s a new generation of solutions that can increase performance, offer more features and improve security – one of which is iSCSI. As its name implies, iSCSI is a version of the SCSI protocol, which is responsible for shuffling data between various SCSI devices. The key difference with iSCSI is that instead of transporting data across local cables and buses, the ‘i’ takes that data across the internet or your local network, attaching remote storage devices to your local system. It’s perfect for storage area networks (SANs) in particular, and you’ll find network administrators in enterprises of all sizes singing the praises of iSCSI’s ability to combine big-business fibre channel commands with generic networking hardware, saving their departments thousands of pounds in both hardware and infrastructure costs. iSCSI is also a great general-purpose remote storage solution that could easily replace the NFS protocol.

Understanding iSCSI

You can now find iSCSI in many network-attached storage (NAS) devices for the home and even standard Linux installations. This is because it’s a great storage protocol for an expanding network. It might be hard work to master, but you can build a working configuration with relative ease. The great advantage that iSCSI has over some of the alternatives is that drives are exported as block devices, just as blocks of data would be transported over an old SCSI cable. This means that, to the Linux kernel, these drives are handled exactly like local block devices. This makes it perfect for connecting the storage area of a database to the client application or, more recently, the virtual storage devices used by VMware and VirtualBox virtual machines. It’s also ideal if you want to run a machine without any local storage – traditionally the domain of NFS.

Use ‘apt-get install iscsitarget’ to add the appropriate packages to your installation.

Before getting started, your first step should be to familiarise yourself with some of the concepts that are used by iSCSI. These appear to take their names from the Terminator films, with the two most important being the Initiator and the Target. The Initiator is the iSCSI equivalent of the client. It’s the machine you want to have access to the remote data – the one that’s running the applications, or your desktop. The Target is the place the Initiator grabs the data from – another machine running the iSCSI server software and managing the requests to and from the storage medium. You’ll often find NAS drives running the Target server, for example, and you’ll only need to run the Initiator on your local machine in order to access the Target drive.

Set up the Target

It’s possible to share a large variety of different storage types over iSCSI, but the easiest to configure are entire drives. iSCSI connects devices at the block level, which means the job of partitioning and formatting can be left to the Initiator rather than the machine that’s attached directly to the hard drive. On our system, the drive we used was listed as /dev/sdb, and we’ll stick with this example throughout this tutorial. The system drive is normally /dev/sda. To find out what yours is, type fdisk -l on the command line for an overview of what’s connected to your system.

Edit the configuration file in nano. You’ll need to add a line which ensures iSCSI is switched on.

On the Target machine, you first need to install the ‘iscsitarget’ package. You can do this either from the Synaptic package manager or by typing sudo apt-get install iscsitarget on the command line. This will also install several configuration files, and the first of these that we need to look at is ‘/etc/default/iscsitarget’. You can type sudo nano followed by the path to the file name to edit it on the command line, or use your favourite desktop editor if you prefer. We only need to make a single edit here – making sure ‘ISCSITARGET_ENABLE=true’ is the only line in the file. Nano users need to press [Ctrl]+[X], then [Y] to save the changes.

Edit configuration files

The next file we need to edit is ‘/etc/ietd.conf’, which you’ll need to open in an editor. This document contains a working configuration, with lines using the # symbol designated as comments and therefore out of action.

The first line we need to edit defines what’s known as the iSCSI qualified name, or IQN for short. Just search for Target, followed by iqn. As with pages on the net, this IQN name has to be unique to your installation, and it takes the form of ‘iqn’, followed by the year and month, then a reversed version of your domain name. This is then followed by a colon and a reference to whatever you’re going to call the target storage device. This can be anything you like. Here’s what we chose:

iqn.2010-01.com.example:storage.disk2
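The naming rule is easy to get wrong by hand, so here is a minimal sketch of it in shell. The make_iqn function is a hypothetical helper of our own, not part of any iSCSI package; it just reverses a domain name and joins the pieces in the order described above.

```shell
# Hypothetical helper: build an IQN from a year-month, a domain name and a label.
make_iqn() {
  # Reverse the dot-separated domain, so example.com becomes com.example.
  reversed=$(printf '%s\n' "$2" | awk -F. '{ for (i = NF; i > 0; i--) printf "%s%s", $i, (i > 1 ? "." : ""); print "" }')
  printf 'iqn.%s.%s:%s\n' "$1" "$reversed" "$3"
}

make_iqn 2010-01 example.com storage.disk2
```

Run with the values from this tutorial, it should print the same IQN we chose.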

Next, we need to define the drive that’s going to be shared over iSCSI. Remove the # symbol from the line starting with ‘Lun’ and modify it to read Lun 0 Path=/dev/sdb,Type=fileio. You need to change ‘/dev/sdb’ to the location of the drive that you’ve decided to share. Next, uncomment the Alias line at the bottom of the current section and save the file.

You’ll need to do some deeper editing to the ‘/etc/ietd.conf’ file.

You may have noticed that there are two lines available for a username and password, but we’re going to leave these untouched for now because we’re running our iSCSI device over a trusted network. You can always come back to this point and change the configuration after you’ve got the basic connection working, if you think that you need to.

To enable the device to be shared, open ‘/etc/initiators.allow’ and add iqn.2010-01.com.example:storage.disk2 ALL. When you get the connection working, you’ll need to change ‘ALL’ to the IP address of the machines allowed to access your iSCSI device, but for now we’re trying to remove all obstacles that could stop us getting the connection working. Start the Target server by typing sudo /etc/init.d/iscsitarget start.
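To see the three Target-side edits in one place, the sketch below writes them to a scratch directory instead of touching /etc, so you can compare them with your real files. The scratch file names, the indentation and the Alias value of ‘Test’ are our assumptions; the IQN and /dev/sdb are the examples used in this tutorial.

```shell
# Write the Target settings to a scratch directory for review; nothing in /etc is changed.
out_dir=$(mktemp -d)
iqn="iqn.2010-01.com.example:storage.disk2"
disk="/dev/sdb"   # change this to the drive you have decided to share

# Stands in for /etc/default/iscsitarget
printf 'ISCSITARGET_ENABLE=true\n' > "$out_dir/iscsitarget.default"

# Stands in for the edited section of /etc/ietd.conf (Alias value is an assumption)
cat > "$out_dir/ietd.conf" <<EOF
Target $iqn
        Lun 0 Path=$disk,Type=fileio
        Alias Test
EOF

# Stands in for /etc/initiators.allow
printf '%s ALL\n' "$iqn" > "$out_dir/initiators.allow"

echo "Review the files in $out_dir"
```

Once the connection works, ‘ALL’ in the last file is the value to swap for the IP address of each permitted Initiator.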

Set up the Initiator

It’s now the turn of the other machine, the Initiator. To begin, you need to install the ‘open-iscsi’ package. Once that’s done, open the ‘/etc/iscsi/iscsid.conf’ configuration file and look for the line starting with ‘node.startup’. Change the default value of ‘manual’ to automatic, save the file and restart the service by typing sudo /etc/init.d/open-iscsi restart.
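If you would rather make that change from the command line, something like the following should do it. This is only a sketch: it works on a scratch copy with sample contents, and it assumes the line is written as ‘node.startup = manual’, so check your own file before pointing sed at it.

```shell
# Work on a scratch copy so the real /etc/iscsi/iscsid.conf is left alone.
conf=$(mktemp)
printf '%s\n' '# sample contents, for illustration only' 'node.startup = manual' > "$conf"

# Switch node.startup from manual to automatic, tolerating spaces around "=".
sed 's/^node\.startup[[:space:]]*=[[:space:]]*manual[[:space:]]*$/node.startup = automatic/' "$conf" > "$conf.new"
mv "$conf.new" "$conf"

# Show the resulting setting.
grep '^node\.startup' "$conf"
```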

Make sure you change the node startup option to ‘automatic’.

Almost everything is now configured and ready to go. Our next step is to probe the Target machine to see what storage services it offers, hopefully listing our drive in the process. Type iscsiadm -m discovery -t st -p, followed by the IP address of the Target. You can find this by typing ifconfig on the Target machine. If everything is set up correctly, you should see the IQN of your drive in the output of iscsiadm. This is the output we received:

iscsiadm -m discovery -t st -p 192.168.1.61
192.168.1.61:3260,1 iqn.2010-01.com.example:storage.disk2
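Each line of discovery output packs two things you will need again later: the portal (IP address and port) and the IQN. As a rough sketch, and assuming the output keeps the ‘portal,group IQN’ layout shown above, plain shell parameter expansion is enough to pull them apart. The variable names are ours.

```shell
# One line of discovery output, copied from the example above.
line='192.168.1.61:3260,1 iqn.2010-01.com.example:storage.disk2'

portal=${line%%,*}   # everything before the first comma
target=${line#* }    # everything after the first space

echo "Portal: $portal"
echo "Target: $target"
```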

Now you know that the Target machine is correctly configured and that iSCSI can see the remote storage device, you need to type iscsiadm -m node. This will automatically create configuration files within /var/lib/iscsi/nodes for the storage unit on your local machine, which will allow the Initiator to attach the device automatically when the service is restarted. You may need to do this manually by typing sudo /etc/init.d/open-iscsi restart.

The remote drive should now be attached to your local machine. The best way to check is to type fdisk -l to list all the storage devices attached to your machine. The iSCSI drive should be part of the output, but you won’t see any indication that it’s connected over a network. That’s the great advantage of using iSCSI.

Mount the drive

iSCSI passes the block information, so you need to format the remote drive from the Initiator machine before it can be used. This process is exactly the same as partitioning and formatting any other drive on your system. You could use fdisk on the command line, for example, or you could take the easy route and use a graphical partitioning tool like GParted, which you can install through your distro’s package manager.

After launching GParted, you first need to select the remote drive from the dropdown list in the top-right of the main window. As with fdisk, the iSCSI drive looks exactly like a local drive, so you need to take special care to select the correct device. You can lose data from other drives permanently with GParted, so proceed with caution.

If you’re hooked up to your target drive you should be able to treat it like any locally connected drive – including partitioning it.

With the remote drive selected, click on the large grey area in the middle of the window marked ‘unallocated’. This is the unpartitioned area on your remote drive. Click on ‘New’. By default GParted will use the entire disk, but you’re free to subdivide the remote drive in exactly the same way you would a local one if you prefer. Leave ‘Primary Partition’ selected, then select ‘ext4’ as the filesystem and give your drive a meaningful label before clicking ‘Add’. Click on ‘Apply’ to make the changes to the remote drive, and then simply sit back and wait while the formatting process finishes.

Now that the drive is correctly formatted and partitioned, it’s time to mount it onto the local filesystem so you can start reading and writing data to it. You can use the ‘mount’ command to attach the drive just as you would any other, but there’s one small difference that you have to consider whenever you use multiple iSCSI devices – they might not always have the same /dev path. The solution is to navigate to the device using the /dev/disk/by-path/ nodes, so you can be sure you’re getting the same disk every time.

If you type ls /dev/disk/by-path, for example, you can easily see which devices are using the IQN address. In our example, we mounted the remote drive onto the local /mnt/iscsi folder with the following commands:

mkdir /mnt/iscsi
mount /dev/disk/by-path/ip-192.168.1.61\:3260-iscsi-iqn.2010-01.com.example\:storage.disk2-lun-0-part1 /mnt/iscsi
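The by-path name looks unwieldy, but it is just the portal, the IQN, the LUN and the partition number strung together. The by_path_device function below is a hypothetical helper that rebuilds the name from those parts, following the pattern of the path in our example; it only reports whether the node exists and leaves the mounting to you.

```shell
# Hypothetical helper: portal, IQN, LUN number, partition number -> by-path device node.
by_path_device() {
  printf '/dev/disk/by-path/ip-%s-iscsi-%s-lun-%s-part%s\n' "$1" "$2" "$3" "$4"
}

device=$(by_path_device 192.168.1.61:3260 iqn.2010-01.com.example:storage.disk2 0 1)
echo "$device"

# Report whether that node is present on this machine.
if [ -e "$device" ]; then
  echo "Device node found"
else
  echo "Device node not present on this machine"
fi
```

Inside double quotes the colons need no backslashes, which is why the name looks slightly different from the one typed at the prompt above.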

You can now read and write files to the /mnt/iscsi mount point, and these are passed directly to the remote drive just as if they were connected using an extremely long SCSI cable.

iSCSI on NAS boxes

Two of the trickiest parts of using iSCSI are finding the hardware and configuring the Target machine, but luckily there may be another option if you have a Linux-based network-attached storage box. Many of these will offer up a chunk of the storage inside the box over an iSCSI connection, and you can usually activate and configure this facility with just a couple of clicks. Some of these will let multiple Initiators access a single target and enable you to add CHAP password authentication. Then it’s simply a case of running the iSCSI discovery procedure on your Initiator hardware. If you’ve chosen to use a password and username, you’ll need to run the ‘iscsiadm’ command on the Initiator to add those values to the configuration file for that Target. The command takes the following format:

iscsiadm -m node --targetname IQN --portal IP AND PORT OF TARGET --op=update --name node.session.auth.

You need to run this three times, changing the end parameters each time. With the first execution, add authmethod --value=CHAP to select CHAP authentication. For the second, add username --value=username to specify the username for the connection, and for the third execution add password --value=password to specify the password.
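Because the three commands differ only in their last two parameters, it can help to see them written out in full. The sketch below is a dry run: the hypothetical chap_commands function prints the commands rather than executing them, and the username and password shown are placeholders, not values from any real Target.

```shell
# Hypothetical helper: print (not run) the three iscsiadm commands for CHAP.
# Arguments: IQN, portal (IP:port), username, password.
chap_commands() {
  for setting in "authmethod=CHAP" "username=$3" "password=$4"; do
    # Split each "key=value" pair at the first "=".
    printf 'iscsiadm -m node --targetname %s --portal %s --op=update --name node.session.auth.%s --value=%s\n' \
      "$1" "$2" "${setting%%=*}" "${setting#*=}"
  done
}

chap_commands iqn.2010-01.com.example:storage.disk2 192.168.1.61:3260 exampleuser examplepass
```

Check the printed lines against your Target's settings before running them on the Initiator.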

Create a virtual hard drive

Many more technical distributions, such as Fedora, use logical volume management (LVM) for local storage and filesystems. One of the advantages of LVM is that you can shrink, expand and create partitions on the fly from the pool of storage on your machine. Logical volumes can be used for all sorts of tasks. Virtual machines often use them to keep data from the main system, and they’re handy if you enjoy playing with filesystems. They’re also useful if you want to try iSCSI, because you don’t need to have a spare hard drive – you just need the space within your logical volume pool.

The key to adding new virtual drives is the ‘lvcreate’ command. We used the following command to create a 10GB logical volume: lvcreate -L10G -n vdrive vg. This creates a volume called ‘vdrive’ in a volume group called ‘vg’. You’ll need to take a look in your /dev directory to discover the name of the logical volume group used by your installation. After creating the drive, it appears in the /dev volume tree just like any other device and you can share it across the iSCSI connection like a real hard drive.

Even without LVM, there are other options for dynamically shared storage. You could create an image file, for example, by typing dd if=/dev/zero of=/mnt/iscsi.img bs=1024k count=1000. The ‘bs’ value sets the block size to 1MB and ‘count’ is the number of blocks to write, so this creates an image of roughly 1GB, while ‘/mnt/iscsi.img’ is the file that’s created. You can use that path as the source for the iSCSI Target on the ‘Lun 0’ line in the ‘/etc/ietd.conf’ configuration file, and use it like a real partition.
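The size of the image is simply the block size multiplied by the count, so you can work out what a dd command will produce before running it. Here is that arithmetic for the values above, with variable names of our own choosing:

```shell
block_bytes=$((1024 * 1024))   # bs=1024k means 1024 x 1024 bytes per block
count=1000                     # count=1000 blocks

image_bytes=$((block_bytes * count))
echo "$image_bytes bytes"
echo "$((image_bytes / 1024 / 1024)) MiB"
```

Change ‘count’ to scale the image up or down.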

Virtualisation

iSCSI is commonly used by cloud applications and the world of virtualisation. This is so you can access the same hard drive data regardless of where the virtual machine is running. If you’re in the market for a major, enterprise-grade virtualisation solution that uses iSCSI, take a look at VMware’s ESXi solution, which is available for free from www.vmware.com.

Boot options

If you want the Target iSCSI device to be available each time you boot the Initiator machine, you will need to add the remote device to the ‘/etc/fstab’ file. The quickest way to do this is to copy another line in the file and change the parameters to suit the iSCSI device. Make sure the path uses the ‘by-path’ format so that you can be sure you get the same drive at each boot.
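As a rough guide to what that fstab line might look like, the sketch below assembles one from the /dev/disk/by-path/ node, mount point and ext4 filesystem used earlier in this tutorial. The _netdev option, which tells the system to wait for the network before mounting, is our assumption rather than something we tested here, and the line is only printed, not written to ‘/etc/fstab’.

```shell
# Values taken from the mount example earlier in the tutorial.
device='/dev/disk/by-path/ip-192.168.1.61:3260-iscsi-iqn.2010-01.com.example:storage.disk2-lun-0-part1'
mount_point='/mnt/iscsi'

# Fields: device, mount point, filesystem, options, dump, fsck order.
# _netdev is an assumed option for network-backed devices.
fstab_line=$(printf '%s %s %s %s 0 0' "$device" "$mount_point" ext4 _netdev)
echo "$fstab_line"
```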

Apr 26

No crime, no lag, no malware: 2020’s internet sounds like heaven. PC Plus checks out its foundations.

Safe, secure and speedy: that’s the internet of 2020. In a decade’s time, the web will be a very different place. There will be no crime, no malware and no fake online banking sites. Latency won’t be a problem. High-definition video will be smooth, and buffering will be a distant, nightmarish memory.

And that’s not all. The internet will have grown dramatically, making room for a new generation of connected devices: cars, phones, TVs, everything. Super-fast speeds are the rule, not the exception. To borrow a phrase, it just works.

At least, that’s what we hope the web will be like. To make it happen, engineers merely need to rethink the way the internet works and change pretty much everything. What could be simpler? Some big changes are already in progress. The explosion of internet-enabled devices means that we’re running out of IP addresses even more quickly than expected: RIPE NCC’s Managing Director Axel Pawlik noted in January that the pool of unassigned IPv4 addresses would run out as early as 2011. But the move to IPv6, which can handle around “a trillion trillion trillion” addresses – 3.4×10^38 if you’re feeling pedantic – is largely a software, not hardware, issue. “In most cases it’s very easy to reprogram connectivity software on a chip to ensure a device is IPv6 compatible,” Pawlik says.
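For the pedants, the figure comes from the width of the address: IPv4 addresses are 32 bits long and IPv6 addresses are 128 bits long, so the totals are 2 to the power of 32 and 2 to the power of 128. A quick sketch with awk (the variable names are ours) shows the scale of the jump:

```shell
# Total address counts in scientific notation, one decimal place.
ipv4_total=$(awk 'BEGIN { printf "%.1e", 2^32 }')
ipv6_total=$(awk 'BEGIN { printf "%.1e", 2^128 }')

echo "IPv4: $ipv4_total addresses"
echo "IPv6: $ipv6_total addresses"
```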

But things aren’t progressing as straightforwardly as you would think. “Despite the simplicity of ensuring compatibility, widespread IPv6 take-up has so far been slow, and many of the best known digital devices available today, including the iPhone, do not yet support the next generation of IP addressing,” warns Pawlik. That lack of urgency is disappearing fast, with big names like Google implementing IPv6 support, router firms embracing the new system and new operating systems – including Windows and OS X – supporting it.

If we’re late embracing IPv6, the internet won’t grind to a halt – existing IP addresses will keep working – but as the European Commission reports, “the growth and also the capacity for innovation in IP-based networks would be hindered”. The EU is pushing IPv6 hard, and it expects European ISPs and “the top 100 European sites” to be IPv6-enabled this year.

As a happy by-product of IPv6, widespread adoption will make the internet more secure too. The IPsec security protocol is a compulsory part of IPv6, which means all IPv6 communications can be encrypted and authenticated.

Route masters

We’re using the internet in ways its creators couldn’t possibly have imagined, from the rise of video to the sheer number of connected devices. We’re constantly pushing the internet’s capacity, stability and security, and inevitably cracks are beginning to show.

Aaron Falk is the Chair of the Internet Research Task Force (IRTF) and Engineering Lead with the Global Environment for Network Innovations (GENI). “There are many areas where the current architecture is straining to meet the needs of the users,” he says. “In particular, the areas of mobility, security, and network management were not well addressed in the original architecture, leading to a patchwork of mechanisms. The greatest concern is not so much that today’s traffic is challenged but that the ad-hoc machinery being inserted into the network will inhibit future innovations. I worry about tomorrow’s applications more than today’s.”

The IRTF is a technological trouble-shooter for internet architecture, as Falk explains: “The IRTF hosts research groups that work in areas ‘adjacent’ to the IETF (Internet Engineering Task Force). This can be pre-standards technologies, hard problems that emerge from the IETF or operations communities, technologies where the internet may be one of many possible communications strategies, or architectural issues.”

He continues: “Sometimes research groups assist IETF working groups by bringing researcher expertise or otherwise ‘pre-baking’ technologies so they are ready for standardisation. For example, the Mobility Optimizations Research Group has been working on IP mobility solutions that feed into the MIPSHOP (Mobility for IP: Performance, Signalling and Handoff Optimization) working group for standardisation. Another example is the IRTF Research Group on Internet Congestion Control (ICCRG) which evaluates new congestion control proposals that arise in the IETF.”

I dream of GENI

One of the problems with the current web is that it’s too big and too important to muck around with. That’s where GENI comes in. The Global Environment for Network Innovations is funded by the US National Science Foundation, and it’s best described as a (serious) playground where new ideas can be tested out. “GENI will support two major types of experiments,” the organisation says. “Controlled and repeatable experiments, which will greatly help improve our scientific understanding of complex, large-scale networks, and ‘in the wild’ trials of experimental services that ride atop or connect to today’s internet and that engage large numbers of human participants.”

“We’re well underway on the second year of GENI prototyping, GENI Spiral 2,” Falk says. “One of our more exciting activities is what we are calling ‘meso-scale deployments’ of virtualisable, programmable routers, switches, and WiMax base stations on 14 campuses and two national research backbone networks. Deployments like these are particularly exciting because they’ll allow experimental applications and services built on GENI to directly reach real users on university campuses. Thus researchers will have the ability to build new services – perhaps incompatible with the current internet – and test them at-scale with real end-users.”

One area of concern is routing tables, which the net’s backbone routers use to direct online traffic. The BGP (Border Gateway Protocol) routing table has grown hugely, doubling in size between 2003 and 2009, and there are concerns that if the level of growth continues, router hardware won’t be able to cope. The IRTF’s Routing Research Group (RRG) is investigating alternatives, and its goal is to produce solid recommendations that the IETF can implement. Another related program is Rochester Institute of Technology’s Floating Cloud initiative, which hopes to address the problem of routing table growth by moving the routing tables from inside routers to network clouds. Initial testing took place on a dozen Linux boxes, and the next step is to try it on GENI.

The BGP routing table doubled in size between 2003 and 2009, and it’s still getting bigger.

GENI isn’t the only initiative that the NSF is helping to fund. Its Future Internet Architectures (FIA) program is offering $30million to fund projects that will transform the net. As the NSF puts it: “Proposals should not focus on making the existing internet better through incremental changes, but rather should focus on designing comprehensive architectures that can meet the challenges and opportunities of the 21st century.”

FIA is a continuation of FIND, the NSF’s Future Internet Design project. FIND asked researchers to redesign the internet from scratch, and FIA will narrow around 50 FIND projects down to two, three or four serious contenders.

Safety and security

With the existing internet, security is something that’s largely been bolted on as an afterthought – but the FIA program expects security to be a key consideration from the outset. That’s leading to some interesting ideas, including one security system that takes its cues from Facebook. Davis Social Links (DSL) adds a “social control layer” to the network that identifies you not by your IP address but by your social connections. If it works – and DSL is in the very, very early stages of development – it could make a major dent in problems such as spam and denial of service attacks.

Eugene Kaspersky, CEO of Kaspersky Lab, would like to take things even further. In October, he argued that the internet’s biggest weakness was anonymity, and that everyone should have online passports. “I’d like to change the design of the internet by introducing regulation – internet passports, internet police and international agreement – about following [web] standards,” he told ZDNet Asia.

Kaspersky explained further on the Viruslist.com blog: “When I say ‘no anonymity’, I mean only ‘no anonymity for security control’,” he writes, explaining that he couldn’t care less what people posted on blogs or downloaded through BitTorrent. “The only [requirement] – you must present your ID to your internet provider when you connect.” Kaspersky argues that such requirements are inevitable, with some EU countries already introducing digital IDs. “Another prototype of e-passports is the two-factor authentication we use to access corporate networks,” he says. “The only thing missing today is a common standard.”

Security guru Bruce Schneier isn’t convinced. “Mandating universal identity and attribution is the wrong goal,” he writes on Techtarget. “Accept that there will always be anonymous speech on the internet. Accept that you’ll never truly know where a packet came from. Work on the problems you can solve: software that’s secure in the face of whatever packet it receives, identification systems that are secure enough in the face of the risks. We can do far better at these things than we’re doing, and they’ll do more to improve security than trying to fix insoluble problems.”

The quest for improved security is attracting a lot of attention – and a lot of money. The US Defense Advanced Research Projects Agency (DARPA) awarded contracts worth $56million in January to two firms as part of its National Cyber Range security programme, which will enable network infrastructure experiments, new cyber testing capabilities and realistic testing of network technology. A month previously, Raytheon BBN Technologies was awarded an $81million contract by the Army Research Laboratory to build the largest communications lab in the US, again to research network security.

David Emm is part of Kaspersky Lab’s Global Research and Analysis Team. “It would be unrealistic to expect a wholesale re-architecture of the internet, or even of some of the technologies that are used online,” he says. “If we fix the problem by removing the facility, we run the risk of damaging legitimate activity too.”

There’s also the issue of displacement: if the internet becomes tougher to compromise, villains will simply switch to social engineering instead. As Emm points out, corporate email filtering to remove attached ‘.exe’ files simply spawned the use of links rather than attachments to spread viruses and other malware. “There has always been a human dimension to PC attacks,” he says. “Patching code is fairly straightforward once you know what you need to fix. But patching humans takes longer and requires ongoing investment.”

The last mile

There’s another big piece of architecture that needs upgrading: the bit between your ISP and you. Whether that’s a wired connection or a wireless one, today’s technology needs a serious speed boost. As Tim Johnson of broadband analyst Point Topic explains, “Over the past 15 years or so we’ve seen the data speeds that typical home users get going up roughly 10 times every five years. I think that will continue over the next decade so that by 2020 many users will be getting a gigabit on their home broadband.

BT’s 21CN project is a software-driven network that aims to drive innovation.

“The big barriers that must be overcome to get there are (a) extending fibre all the way to the home, and (b) providing the backhaul capacity and the interconnect standards to make it useful,” he elaborates. “Both of those are do-able but I think it will be quite late in the teens before they are achieved.”

Johnson reckons that things will get particularly interesting when 100Mbps+ connections are the norm, as they will be able to deliver immersive, high-definition environments and “a huge new space of technology, applications and lifestyle possibilities”. But he’s not convinced the internet can even handle that – not in its current form, anyway.

“This kind of application is rather different from what the internet was designed for and is good at,” he says. “From an engineering point of view it will mean provisioning capacity that will allow users to set up assured end-to-end symmetrical calls of at least 20Mbps each way. There also needs to be a huge amount of standards development and investment to support setup and switching. […] It’s possible that this could all be done across the open internet, but my own belief is that as this type of traffic grows it will create the need for more dedicated capacity. IP and intelligent multiplexing will still rule, but the basic architecture will be different.”

Going mobile

In developed countries, the internet is moving away from the desktop and onto mobile phones and other wireless devices, while in developing countries the internet is primarily a mobile medium already. In both developed and developing countries the number of mobile internet users will increase dramatically in the next decade. So if you think the mobile networks are creaky now, things could get considerably worse in a decade.

For the mobile internet at least, the future may look an awful lot like the past. As Jon Crowcroft of the University of Cambridge writes: “We are so used to networks that are ‘always there’ – so-called infrastructural networks such as the phone system, the internet, the cellular networks (GSM, CDMA, 3G) – and so on that we forget that once upon a time (why, only in the 1970s) computer communications were fraught with problems of reliability, and challenged by very high cost or availability of connectivity and capacity.”

Noting that technologies such as email coped fine in those conditions, Crowcroft suggests that, “It appears that it’s worth revisiting these ideas for a variety of reasons: it looks like we cannot afford to build a Solar System-wide internet just yet, [but] it looks like one can build effective end-to-end mobile applications out of wireless communication opportunities that arise out of infrequent and short contacts between devices carried by people in close proximity, and then wait until these people move on geographically to the next hop. It’s interesting to speculate that these systems may actually have much higher potential capacity than infrastructural wireless access networks, although they present other challenges (notably higher delay).”

Such systems – variously called Intermittent, Opportunistic or Delay Tolerant networks – have a wide range of applications. They’re useful in emergencies and in areas where there isn’t an existing network infrastructure, and they’re particularly well suited to emerging applications where a constant signal can’t be guaranteed, such as internet-enabled cars.

While such networks could ultimately be deployed in remote areas, for most of us the future of the mobile internet is very similar to what we’ve already got. LTE (Long Term Evolution) is a kind of 3G network with knobs on, and in the UK at least it’s generating much more interest than the rival WiMax technology. When LTE begins to roll out later this year it will deliver theoretical speeds of up to 140Mbps, rising to 340Mbps after a 2011 upgrade. An even faster version of the network, LTE Advanced, is in the works. It’s worth noting, though, that even the first version of the LTE network will take several years to roll out nationwide.

And WiMax? In February this year, Patrick Plas – Alcatel-Lucent’s Chief Operating Officer for Wireless – told reporters that the company “is not putting a lot of effort into this technology any longer” as mobile networks were showing “a clear direction taken by the industry towards LTE”. That’s an honest indication of where the mobile internet is heading.

Looking ahead

Predicting the future is a tricky business, and predicting the future of the internet is doubly so. However, it’s clear that the next decade will see some dramatic changes in the way the web works. Some changes are definite – the move to IPv6 will happen, albeit more slowly than many would like – while other developments such as opportunistic networks may never become mainstream.

What we can predict is that the internet of 2020 will be coping with user numbers and traffic volumes that we can barely imagine. To be able to cope with that, the net will probably become a hybrid: a mix of old and new. As Falk puts it: “Recent interest in ‘clean slate’ network architectures encourages researchers to consider how the internet might be designed differently if, say, we knew then what we know now about how it will be used. But that is not to say we must discard the current internet to fix the problems. The internet has tremendous value, has supported astronomical growth and changed the lives of millions of people. I believe research in new internet designs will provide insights on where the high-leverage points are on the current design thus allowing us to understand, justify, and deploy changes that will bring the greatest benefit.”

Apr 15

If you think you have the skills to match Catherine Willows, Graveyard Shift Supervisor with the Las Vegas Police Department, then read on.

The super-sleuth detectives in TV show CSI have some very nifty tools to help solve crimes. But the need to keep things interesting and wrap the show up in an hour means the technology used in each episode bears little resemblance to the work of real forensic experts. Or does it? When it comes to computer forensics, today’s tools are becoming more advanced, leaving fewer places to hide information. This tension between fact and fiction took on a whole new dimension when Microsoft’s police-only forensic toolkit was leaked on the internet. Reports say that it has more in common with CSI than The Bill.

We’re going to show you how to mimic Microsoft’s offering using open-source software to unlock Windows accounts, investigate suspicious activity, see any file on a Windows disk and even peruse files that others believe have been permanently deleted.

Forensic toolkit

During November 2009, it was announced that someone had leaked Microsoft’s secret crime-fighting software online. The toolkit has been described as a collection of programs linked by a sophisticated script, and hackers and other cybercriminals had been dying to get their hands on it for some time. Now it’s reportedly available to anyone brave enough to download and install it.

The Computer Online Forensic Evidence Extractor (or COFEE for short) has been available to police forces since at least summer 2007, and is designed to gather forensic evidence at crime scenes and during raids from the still-running PCs of suspects and victims. COFEE reportedly takes the average police officer about 10 minutes to master, and comes supplied on a bootable USB pen drive. It enables trained officers to gather evidence from a running system without the need to call in cybercrime specialists, thereby speeding up investigations.

The USB drive itself is said to contain a package of about 150 forensic programs that enable an investigator to record sensitive information like internet history files and complete practical tasks like deleting Windows passwords. It also enables them to upload the recorded data for further analysis. By April 2008, it was reportedly in use by over 2,000 law enforcement officers throughout 15 countries.

At the time of the leak, Microsoft claimed that COFEE was nothing more than a collection of commercially available programs brought together in a single handy package, which it makes available free of charge (if hitherto secretly) to help combat computer crime.

If that’s true, then is it also possible to create your own version of COFEE using free, open-source software that will grant you complete access to a Windows computer? The answer is a resounding yes, but we must stress that using what you’re about to learn for malicious purposes on a computer you don’t own isn’t big and it’s certainly not clever.

Don’t use the following information to try to hack other people’s computers or networks. Without the in-depth knowledge required to cover your tracks, you’ll be caught and will probably face prosecution. If you hack computer systems in the US and get caught, you should be prepared to undergo a one-sided extradition process and go through a judicial system that will put you on a par with hardened terrorists before forcing you to serve a long prison sentence.

There are plenty of commercial computer forensics systems around these days, but many of them cost serious money or are only available to the police. However, the open source community has a solution in the form of a special Linux distribution called Backtrack 4.

Introducing Backtrack 4

Backtrack 4 is based on a stripped-down version of Ubuntu Linux, which is a popular choice for home users because of its ease of installation and use. The makers of Backtrack 4 have packed the distribution with special security and forensics tools. These make it extremely useful to network security specialists and police forces, as well as anyone interested in knowing exactly what’s happening on their own networks and any second-hand machines they’ve bought.

Backtrack contains a formidable array of hacking tools.

Despite being Linux-based, Backtrack will grant you complete access to data stored on computers running any version of Microsoft Windows. That’s because Windows isn’t running when Backtrack is booted from a DVD or USB pen drive. Linux can read Windows disks, but it doesn’t obey the file permissions, so the machine’s hard disk simply seems to contain a lot of files waiting to be accessed.

As well as booting and running directly from a DVD as a Live CD installation that never installs on your computer, you can also install Backtrack on a hard disk as the only operating system, or next to an existing Windows installation. If you plan to install Backtrack on a USB pen, you’ll need one with a minimum 2GB capacity. This booting option brings Backtrack closer to Microsoft’s COFEE than any other option.

First, you need to download the Backtrack 4 ISO file, which is just under 1.6GB. You can download it from the Backtrack site directly or click the ‘Torrent’ link on the same page. A torrent lets you fetch parts of the file from multiple sources in parallel, so in practice it’s faster to download the ISO that way. You’ll need a BitTorrent client such as Vuze installed first, after which you can just click the ‘Torrent’ button on the Backtrack site’s download page.

Once the ISO has downloaded, use it to make a bootable DVD. We’ve listed a free and easy-to-use CD/DVD package capable of making bootable disks in the Resources section. When that’s done, test your work by ensuring your BIOS is set to boot from CD/DVD before attempting to boot from your hard disk, then insert the DVD and reboot the PC. Select the option to boot with a screen resolution of 1,024 x 768. When Backtrack has booted, you’ll see a command line. To start a desktop environment, enter the command startx and press [Enter]. After a few seconds, the standard KDE desktop will start.

Don’t be put off by the command line that appears when you first boot up.

Find your way around

Backtrack is loaded with all the obscure little utilities used by professional security consultants. Many of them are fiddly command-line programs, but a lot have graphical front ends that make them simple to use.

Hover your mouse over the icons on the menu bar at the bottom of the desktop and KDE will tell you the name of each one. We’ll use the names that appear when you do this to make things easy to identify here.

Backtrack is designed for network security work, so the network interfaces are disabled by default when you boot it. This is because if anyone (or anything) is listening to network traffic, the last thing you want to do is announce your presence by requesting an IP address over DHCP.

To enable networking, click the black Konsole icon to open a terminal window, then enter the following command: /etc/init.d/networking start. After a moment or two, during which lots of verbiage scrolls up the screen, open Firefox (the icon is next to the terminal on the menu bar) and enter www.google.com as a URL. You should see the world’s favourite search engine appear.
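The Firefox check above boils down to asking whether a TCP connection to a web server succeeds. As a rough sketch only, the same test can be scripted in Python; the function name can_reach and its default port and timeout are our own illustrative choices, not part of Backtrack.

```python
import socket


def can_reach(host: str, port: int = 80, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; the name and defaults are assumptions.
    """
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts alike.
        return False


# Example: can_reach("www.google.com") mirrors the browser check described above.
```

A result of False doesn’t say why the connection failed, so it’s only a quick sanity check that networking has started.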

Much like the Start button in Windows, the left-hand icon on the menu bar brings up the installed programs and system configuration options. This is called the K menu and is organised into subject areas. The one we’re most interested in is the first: ‘Backtrack’. Click on this and you’ll see a submenu containing categories of hacking programs, with which Backtrack has been preloaded. Clicking one of these reveals nested subcategories right down to individual programs.

Map the neighbourhood

Let’s begin by scanning the local network for hosts (another name for networked computers). Starting from the K menu, select ‘Backtrack | Network Mapping | Identify Live Hosts | Autoscan’. A wizard will appear. Click ‘Forward’ and you’ll be asked for the name of a network to scan. Leave this as ‘Local network’ and click ‘Forward’ again. The next screen asks where the network is located. We’re scanning the local network, so accept the default of it being connected to your computer by clicking ‘Forward’ once more.

Next, select the default network adaptor. This will usually be called ‘eth0’. If you don’t see any adaptors in the pulldown menu, it’s because you didn’t start networking earlier. Close Autoscan, start networking and run Autoscan again. Click ‘Forward’ one last time to confirm what you’ve asked Autoscan to do, then maximise the user interface that appears so you can see everything. Autoscan now contacts every possible IP address on the local subnet to see if there’s a machine connected to it. If there is, it adds an entry to the left-hand pane. Notice that in some cases, Autoscan can even tell you the username that’s logged in.

When you select a host, Autoscan will attempt to gain more information about it for you. A wizard will also appear, asking you to add it to the Autoscan online database. Cancel this. You can switch between the tabs in the interface’s right-hand pane to display a summary of the machine, detailed information or an inventory.

Autoscan works by sending a stream of specially crafted packets to each host in turn. These are designed to return information about the running system and can give away a surprising amount of information. Autoscan is a useful tool for detecting whether your neighbours are leeching your Wi-Fi, for example. If you don’t recognise a host, it’s probably an intruder – so up your security!
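To get a feel for what ‘every possible IP address on the local subnet’ means, here’s a minimal Python sketch that lists the addresses a sweep would have to try. It isn’t how Autoscan itself is written; the subnet 192.168.1.0/24 and the function name candidate_hosts are assumptions chosen for illustration.

```python
import ipaddress
from typing import List


def candidate_hosts(cidr: str) -> List[str]:
    """List every usable host address in the given IPv4 subnet."""
    # strict=False lets you pass a host address such as 192.168.1.17/24.
    network = ipaddress.ip_network(cidr, strict=False)
    # hosts() skips the network and broadcast addresses for ordinary subnets.
    return [str(address) for address in network.hosts()]


hosts = candidate_hosts("192.168.1.0/24")  # assumed example subnet
print(len(hosts), hosts[0], hosts[-1])     # prints: 254 192.168.1.1 192.168.1.254
```

A typical home network therefore has 254 addresses to probe, which is why a full sweep takes a little while to finish.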

Wipe passwords

Logging into a Windows system is easy using Backtrack, even if you don’t know any of the usernames or passwords that have been set up. That’s because you can use a utility bundled with Backtrack to remove the password on any Windows account, including administrator accounts. This is possible because of a file called the SAM (Security Accounts Manager), which stores the account credentials. It’s normally locked by the Windows kernel so that nothing else can read it, but it can be modified when Windows isn’t running.

First, we need to find out where the system’s hard disk resides in Linux. To do this, click the Konqueror icon on the desktop menu bar. This will open the Konqueror desktop browser. Click the ‘Storage media’ link. If you don’t see anything right away, press [F5] to refresh the view. Among the media that Backtrack knows about on your system, you’ll see your hard disk. Click this and you’ll see the folders in C:\, which is useful if you need to copy, add or modify files without logging into Windows directly.

Now select the Home icon on the Konqueror toolbar (the one that’s shaped like a house) and click the blue ‘up’ arrow next to it. Click the Media folder, and then the ‘Hard disk’ icon again. The location bar will change to give the name we must use to access the disk. It’ll be something like ‘/media/disk’.

Now, from the K menu, select ‘Backtrack | Privilege Escalation | Password Attacks | Chntpw’. ‘Chntpw’ stands for ‘Change NT Passwords’ and it works on NT-based versions of Windows, including XP, Vista and Windows 7. When you run the command, a terminal window opens. You can ignore the verbiage on the screen and enter the following command: chntpw -i /media/disk/Windows/System32/config/SAM. Capitalisation is very important here: Linux paths are case-sensitive, and ‘chntpw’ itself is all lowercase. If your Windows partition is called something other than ‘disk’, put its name in place of this in the command. Press [Enter] and a text-based menu will appear. Select ‘Option one’ and press [Enter] again. This gives you a list of the Windows user accounts. Type the name of the account you want to change (taking care to use the correct case for each letter) and then press [Enter].

Using the Chntpw utility to wipe a user’s password enables you to log into that account unhindered.

Chntpw displays lots of details about the account and gives you a number of options. Select ‘Option one’ and the password will be removed from the account. To exit, type ! and press [Enter], then press [Q] and hit [Enter] again. Chntpw will ask if you want to write the hive files. You do, so press [Y] followed by [Enter].

If you now reboot into Windows, you’ll be able to log into the account you’ve changed without being prompted to enter a password.
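If your Windows partition isn’t mounted at ‘/media/disk’, the only part of the chntpw command that changes is the start of the path. The Python sketch below just assembles the command quoted earlier from a mount point. The helper name chntpw_command is hypothetical, and the SAM location is the one given in this article, so check the folder names and their capitalisation on your own disk.

```python
import posixpath
from typing import List

# Location of the SAM file relative to the Windows partition, as quoted in the text.
SAM_RELATIVE_PATH = "Windows/System32/config/SAM"


def chntpw_command(mount_point: str) -> List[str]:
    """Build the interactive chntpw invocation for a mounted Windows partition."""
    sam_path = posixpath.join(mount_point, SAM_RELATIVE_PATH)
    # -i is the interactive option used in the command given in the article.
    return ["chntpw", "-i", sam_path]


print(" ".join(chntpw_command("/media/disk")))
# prints: chntpw -i /media/disk/Windows/System32/config/SAM
```

The function only builds the command as text; it doesn’t run chntpw or touch the disk.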

Recovering deleted files

Many people believe that when they delete a file and then empty the Recycle Bin, it’s gone for good – but this isn’t the case. Windows, like all modern domestic OSes, simply marks the sectors on the disk occupied by the deleted file as available for future reuse. It would be inefficient to overwrite the data those sectors contain until new data is ready to be stored. In the meantime, the old file is still there, available to be read by anyone with access to a file recovery utility.
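The idea is easier to see in a toy model. The Python sketch below is a deliberately simplified stand-in for a real filesystem (the ToyDisk class and its methods are invented for illustration): deleting a file only flips its sectors to ‘free’, and the old bytes survive until something new is written over them.

```python
from typing import Dict, List


class ToyDisk:
    """A toy disk: a list of sectors plus a record of which sectors each file uses."""

    def __init__(self, sector_count: int) -> None:
        self.sectors: List[bytes] = [b""] * sector_count
        self.free: List[bool] = [True] * sector_count
        self.files: Dict[str, List[int]] = {}

    def write(self, name: str, chunks: List[bytes]) -> None:
        """Store each chunk in the first free sector and record where it went."""
        used = []
        for chunk in chunks:
            index = self.free.index(True)  # raises ValueError if the disk is full
            self.sectors[index] = chunk
            self.free[index] = False
            used.append(index)
        self.files[name] = used

    def delete(self, name: str) -> None:
        """Forget the file and mark its sectors reusable; the bytes are untouched."""
        for index in self.files.pop(name):
            self.free[index] = True

    def leftover_data(self) -> List[bytes]:
        """Return whatever still sits in sectors marked as free."""
        return [data for data, is_free in zip(self.sectors, self.free) if is_free and data]


disk = ToyDisk(sector_count=4)
disk.write("secret.txt", [b"meet at ", b"midnight"])
disk.delete("secret.txt")
print(disk.leftover_data())  # prints: [b'meet at ', b'midnight']

disk.write("new.txt", [b"hello"])  # reuses the first free sector
print(disk.leftover_data())  # prints: [b'midnight']
```

The second print shows why recovery is a race against time: each new file written to the disk may overwrite part of an old one.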

Backtrack contains several such applications. Among the easier to use is PhotoRec, which is capable of scanning a hard disk and recovering a comprehensive list of all files marked as deleted. In fact, it can recover far more than just files deleted by users, including temporary files left over from when the operating system was installed. This means it’s a good idea to have a spare USB pen drive handy to store the recovered files for later perusal, because they can easily run into the thousands. To get going, insert the drive and run Konqueror. Click ‘Storage media’ and then select your USB pen drive to ensure that Backtrack is aware of it. You can leave Konqueror open and check the scan’s progress later.

Now run PhotoRec by navigating to ‘Backtrack | Digital Forensics | Forensic Analysis’ and then selecting ‘PhotoRec’.

The program itself runs on the command line, but it’s menu-driven, making it easier to use. When PhotoRec runs, it first presents you with a list of the hard disk partitions on the computer. In the case of a Windows-only machine, there’ll probably be only one large one. However, in some Windows 7 installations, there may be a second, small partition that the system uses to store recovery data. Use the up and down arrow keys to select the main partition, then press [Enter] to continue.

PhotoRec can understand a large number of partition table types and will automatically identify the one used on your disk, so accept the default on the next screen by pressing [Enter] again.

The next screen enables you to specify the file types to recover. Use the left and right arrow keys to highlight ‘File Opt’ at the bottom of the screen. Next, press [Enter]. The resultant display will give you a long list of all the recognised types. If you only want to recover one file type (JPG, for example), press [S] to deselect everything, then scroll down to the relevant type and press [Space]. You can use the [Page up] and [Page down] keys to navigate through the list more quickly.

Once you’re happy with your file type selections, press [Enter] and select the filesystem you want to scan. Use the left and right arrow keys to select the ‘Search’ option, then press [Enter]. This presents you with a choice of filesystem types. For a Windows filesystem, make sure you select ‘Other’, then press [Enter]. On the next screen, select ‘Free’ to ensure that the program only scans disk sectors that are marked as free space. Press [Enter] again to continue.

You’ll now be asked where to store the recovered files. The default is the directory ‘/usr/local/bin’, which is on the boot media. Press the left arrow key three times to get back to the root directory, then press the down arrow key repeatedly to navigate to the media directory. When you reach it, press [Enter] to see the media connected to the system. One of the devices you find should be the USB pen drive you inserted and navigated to in Konqueror just a moment ago. Select this and press [Enter] again. Finally, press [Y] to begin recovering deleted files.

The extraction process can take quite a while, depending on how much free space there is to scan on the disk and the number of file types you’ve specified. As the scan progresses, the number of files of each type will increase. PhotoRec creates a long list of subfolders in which it stores all the files it’s recovered. By perusing these, you may be able to locate some interesting or even incriminating pictures and other documents.
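PhotoRec finds files by looking for the tell-tale byte patterns that mark known file types, rather than relying on the filesystem’s own records. The Python sketch below shows a much-simplified version of that idea for JPEG images, using the standard start and end markers. It’s an illustration only: the function carve_jpegs is our own invention, and real recovery tools cope with fragmentation, embedded thumbnails and many more formats.

```python
from typing import List

JPEG_START = b"\xff\xd8\xff"  # start-of-image marker plus the first byte of the next marker
JPEG_END = b"\xff\xd9"        # end-of-image marker


def carve_jpegs(raw: bytes) -> List[bytes]:
    """Return every byte run that begins with JPEG_START and ends with JPEG_END."""
    found = []
    position = 0
    while True:
        start = raw.find(JPEG_START, position)
        if start == -1:
            break  # no more start markers
        end = raw.find(JPEG_END, start + len(JPEG_START))
        if end == -1:
            break  # start marker without an end marker: give up
        end += len(JPEG_END)
        found.append(raw[start:end])
        position = end  # carry on searching after this file
    return found


# A made-up chunk of 'free space' with one fake image buried in it.
raw = b"\x00" * 8 + b"\xff\xd8\xff\xe0fake image\xff\xd9" + b"junk"
carved = carve_jpegs(raw)
print(len(carved))  # prints: 1
```

Because this approach never consults file names or folders, recovered files end up with generated names, which is why you’ll be sifting through numbered subfolders afterwards.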