May 11

When we need to select an algorithm for a particular purpose, we should pay attention to its runtime characteristics: how fast it is; how much memory it uses; whether there’s a worst case for the algorithm’s execution speed; and so on. All these answers are expressed with the big-Oh notation, which I’ll describe later.

A common abstract data structure that’s used all the time in programming is the dictionary or associative array, which is sometimes known as a map. I call it an abstract structure because it can be implemented in myriad different ways, but it always has a specific interface. We’ll use the dictionary to investigate the runtime efficiency of various algorithms that can be used to implement it.

OK, so it’s not this kind of dictionary. We’re referring to a digital one. An associative array.

But first, a definition: a dictionary is a structure that holds name-value pairs. A name-value pair is an object that has a name – that’s used both to describe its value and as a key to find it – and a value, which can be anything at all. The classic example is a real-world dictionary, where the name is a word and its value is the word’s definition. However, don’t limit yourself to assuming the name is always some kind of text string. In reality, names can be integer values, bit strings, 128-bit GUIDs, dates or anything at all. That said, it’s helpful to assume that they’re text strings for now.

The dictionary has various operations that define its external interface. There’s the ‘Create’ operation, which creates a new dictionary, and the ‘Destroy’ operation, which releases any resources the dictionary is using and destroys the structure. A dictionary can only be used after ‘Create’ has been called, and once ‘Destroy’ is executed, it no longer exists. Since these operations are only used once each per dictionary, they won’t have much effect on the overall runtime and so we won’t discuss them any further.

When given a name, ‘Find’ will search for the name-value pair that matches and return its value or an error if the name is not found. ‘Exists’ will do the same, except it will merely return true or false according to whether the name is present or not. Since they’re virtually identical, apart from what they return, we’ll ignore ‘Find’ from now on.

Finally we have ‘Insert’ and ‘Delete’, which do what you’d expect: add a new name-value pair to the dictionary (returning an error if the name already exists), and remove the name-value pair that matches a given name, respectively. In general, ‘Delete’ won’t return an error if the name is not found, and sometimes ‘Insert’ will merely replace the value if the name already exists.
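To make that interface concrete, here is a minimal sketch of it as a Python abstract class. The class and method names are my own invention for illustration, and I've assumed that Python's constructor and garbage collector stand in for ‘Create’ and ‘Destroy’.

```python
from abc import ABC, abstractmethod
from typing import Any


class Dictionary(ABC):
    """Abstract dictionary interface (hypothetical names, for illustration)."""

    @abstractmethod
    def exists(self, name: str) -> bool:
        """Return True if a pair with this name is present."""

    @abstractmethod
    def find(self, name: str) -> Any:
        """Return the value stored under name, or raise KeyError."""

    @abstractmethod
    def insert(self, name: str, value: Any) -> None:
        """Add a new pair; raise KeyError if the name already exists."""

    @abstractmethod
    def delete(self, name: str) -> None:
        """Remove the pair with this name; do nothing if it is absent."""
```

Any concrete structure that fills in these four methods counts as a dictionary, whatever it does internally.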

Now that we have our abstract data structure, let’s investigate first how to implement it and second analyse the efficiency of our implementations. We’ll look at a total of four implementations.

Name-value pairs

The first implementation is the most obvious: use an array of name-value pairs. ‘Exists’ is the first operation to think about. In essence, to see whether the given name is present, you would check every pair in the dictionary sequentially and stop when you found it. If the given name isn’t present, you would compare the name of every name-value pair to the given name. The more pairs there are, the longer it would take, but you can be even more precise than that. Suppose there were N pairs in the dictionary and each comparison took the same (constant) length of time – say t. Then it would take tN time units to find out the given name wasn’t present. Another way of putting this is that the time taken for the nonexistence check is proportional to N. In computer science, without going into too much rigorous mathematics, we say the runtime efficiency is O(N), pronounced ‘big-Oh of N’, although you can read it as ‘is proportional to N’.

So if it took so many seconds to find out that a given name wasn’t in a dictionary of 1,000 pairs, it would take twice as long for a dictionary of 2,000 pairs, and 10 times as long for a dictionary of 10,000 pairs.

What if the given name was in the dictionary? What could we say then? Well, it could be that the matching pair was the first item checked. In that scenario, we say the best case efficiency for ‘Exists’ is O(1), which you read as ‘is constant’ (in other words, it doesn’t depend at all on the number of items in the dictionary). But, of course, for that to happen, you’d have to be extremely lucky. You could be completely unlucky and be looking for the final item. Here the worst case efficiency is O(N) – the time taken would be proportional to the number of items in the dictionary.

On average, though, if you searched for every name in the dictionary, the efficiency would be O(N/2). Now comes the fun bit with big-Oh notation: since it essentially means ‘is proportional to’, you can take the 1/2 (a constant) out of the parentheses into the implied proportionality constant and say that the efficiency is O(N). We say that searching through the dictionary-as-array is O(N): twice as many items, twice as long.

‘Insert’ is simple: we add the new name-value pair to the end of the array, a constant O(1) operation. Hold on there though – we first have to search the array to find out if the name is already present or not. ‘Insert’ then degenerates to O(N), just like ‘Exists’. We get no benefits at all from the constant, quick, add-it-to-the-end operation; we still have to search.

‘Delete’, as I’m sure you can see, is at least O(N) as well – we have to do the search. There’s something else about ‘Delete’ that we have to take into account: we have to physically remove the name-value pair from the array. The simplest way of doing this is to simply take the final pair in the array and put it in the slot vacated by the pair that was removed: a constant O(1) operation. So, overall, ‘Delete’ is O(N); the search time will swamp the move-an-item time.
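The three operations above can be sketched as follows. This is only an illustration under my own naming (`ArrayDictionary` and its methods are not from any library), using a Python list as the array.

```python
class ArrayDictionary:
    """Dictionary stored as an unsorted list of (name, value) pairs."""

    def __init__(self):
        self._pairs = []

    def _index_of(self, name):
        # Sequential search: up to N comparisons, so O(N).
        for i, (pair_name, _) in enumerate(self._pairs):
            if pair_name == name:
                return i
        return -1

    def exists(self, name):
        return self._index_of(name) != -1

    def insert(self, name, value):
        # The O(N) search comes first; the append itself is O(1).
        if self._index_of(name) != -1:
            raise KeyError(f"{name!r} already exists")
        self._pairs.append((name, value))

    def delete(self, name):
        i = self._index_of(name)
        if i == -1:
            return  # absent names are silently ignored
        # Move the final pair into the vacated slot, then drop the tail: O(1).
        self._pairs[i] = self._pairs[-1]
        self._pairs.pop()


d = ArrayDictionary()
d.insert("apple", 1)
d.insert("pear", 2)
d.insert("plum", 3)
d.delete("apple")
print(d.exists("apple"), d.exists("plum"))  # False True
```

Note that every public operation calls the same sequential search, which is why all three end up O(N).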

Sorted pairs

Let’s move on to the second implementation. This one is again an array, except this time we maintain the pairs in sorted order. This has the assumed requirement that the names are sortable and that, given any two unequal names, we can say that the first is smaller or greater than the second.

We’ll start off by analysing ‘Exists’ again. The array is in sorted order, so we can use binary search to try and find the name-value pair that matches. With binary search, we look at the middle item in the array. If it’s the one we want, we stop. If the one we want is less than this middle item, we know that, if it’s present at all, it’ll be in the first half of the array. If the one we want is greater than the middle item, we know it will be in the second half. We repeat this process with the half array we selected. We’ll either find the item immediately again, or we’ll have reduced the number of items we have to search to a quarter of the array. Ditto the next step, except we reduce the space we have to search to an eighth of the original array. And so on.

Again, consider the doesn’t-exist case. Say we start out with an array with 1,023 items. After one step, we’ll have discarded one item and will have identified a subarray of 511 items for the next step. After this next step, we’ll have reduced the search space to 255 items, and so on. At the 10th step we’ll have a tiny array of just one item, which we can easily compare. So all in all, we’ll have made 10 comparisons to find out that the given name is not present. What’s so special about 10? Well, it’s the logarithm to base two of 1,024, one more than the number of items we started with (that is, 2^10 = 1,024). Again, without being too rigorous mathematically, we say ‘Exists’ is O(logN) when the name isn’t present.

Think of O(logN) this way: if it takes a particular length of time to find out that a given name isn’t present in a sorted array of 1,000 items, it will only take twice as long for an array of 1,000,000 items. If you square the number of items, you double the time taken. This is an extremely significant result, showing the importance of binary search. What if the given name is present? We can make the same analysis as before: best case is O(1), worst case is going to be the same as not finding it: O(logN), and so we say that, overall, ‘Exists’ is O(logN).
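If you'd like to check the 1,023-item arithmetic for yourself, here is a small sketch of binary search that counts how many middle items it looks at. The function name is made up for this example, and I've used integers rather than text strings as the names.

```python
def binary_search_steps(names, target):
    """Return (found, steps) for a binary search over a sorted list."""
    lo, hi = 0, len(names) - 1
    steps = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        steps += 1  # one look at a middle item
        if names[mid] == target:
            return True, steps
        if target < names[mid]:
            hi = mid - 1  # continue in the first half
        else:
            lo = mid + 1  # continue in the second half
    return False, steps


names = list(range(0, 2046, 2))  # 1,023 sorted even numbers
print(len(names))                        # 1023
print(binary_search_steps(names, 777))   # (False, 10)
print(binary_search_steps(names, 1022))  # (True, 1)
```

The odd number 777 can't be in the list, so the first search is the doesn't-exist case and takes the full 10 steps; 1022 happens to be the very first middle item, which is the lucky O(1) best case.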

What about ‘Insert’ and ‘Delete’? Again, we have to search for the name, so it would seem that they’re both O(logN). But this time, consider what we must do to add (or remove) the name-value pair. For ‘Insert’, we have to make a hole in the array to put the new pair in, shuffling all the items greater than it along by one. For ‘Delete’, we have to shuffle the remaining pairs to close up the hole vacated by the removed pair. If we’re lucky, in both cases, we don’t have to move any items (that is, best case is O(1)); if we’re unlucky we have to move all of the remaining pairs (that is, worst case is O(N)). On average, it’s O(N) for all the shuffling we need to do. Since O(N) is bigger than O(logN) – for very large values of N the (in)efficiency of the moving of the items will swamp the efficiency of the search – we ignore the smaller proportionality and just use the larger one. We say ‘Insert’ and ‘Delete’ are both O(N).
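Here is a rough sketch of the sorted-array dictionary, using the `bisect` module from Python's standard library for the binary search. The class name and the choice of two parallel lists are my own assumptions, made to keep the example short.

```python
import bisect


class SortedArrayDictionary:
    """Dictionary kept as two parallel lists, sorted by name."""

    def __init__(self):
        self._names = []
        self._values = []

    def exists(self, name):
        i = bisect.bisect_left(self._names, name)  # O(log N) search
        return i < len(self._names) and self._names[i] == name

    def insert(self, name, value):
        i = bisect.bisect_left(self._names, name)
        if i < len(self._names) and self._names[i] == name:
            raise KeyError(f"{name!r} already exists")
        # list.insert shuffles every later item along by one: O(N).
        self._names.insert(i, name)
        self._values.insert(i, value)

    def delete(self, name):
        i = bisect.bisect_left(self._names, name)
        if i < len(self._names) and self._names[i] == name:
            # Deleting closes the hole by shuffling later items back: O(N).
            del self._names[i]
            del self._values[i]


d = SortedArrayDictionary()
for word in ["pear", "apple", "plum"]:
    d.insert(word, len(word))
d.delete("pear")
print(d._names)  # ['apple', 'plum']
```

The search is a single cheap call in each method; it's the `insert` and `del` on the lists that do the O(N) shuffling.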

Hash table

Now for the next implementation: the hash table. Without going into full detail, we have an array as the basic data structure. Again, we analyse ‘Exists’ first. To find an item in a hash table, we hash the given name to produce an index into the array. The hash is produced by a randomising type function that takes the name, chops it up and combines the parts to produce an integer value. That integer value is then reduced to a possible array index value by use of the mod operator. The hash function is designed so that similar names produce very different hash values.

Best case is that ‘Exists’ is O(1). That is, we create the hash for the given name, convert it to an index, go to that element in the array, and the pair we need is there and matches. No matter how many items are in the array, that process is constant. (Actually, the hash function is usually O(k) where k is the length of the name, but we’re ignoring that for now.)

What about worst case? Well, in practice we’ll find that many names will hash to the same array index value. These are called collisions and we need to implement a collision resolution strategy to deal with them. The simplest is known as chaining, where we chain the name-value pairs as, say, a linked list at each array element. In this case, once we’ve calculated the index, we then do a sequential search through the chain at that index.

To ensure that the chain is never too long, hash tables grow themselves periodically when their load factor (the number of pairs present divided by the number of array elements) reaches a particular value. To do this, a new array is created, and all the pairs are rehashed and inserted into the new array. This ensures that chains never grow beyond a few items, say five or 10. Since this isn’t dependent on the total number of items, it’s still constant and we say ‘Exists’ in a hash table is O(1) on average.
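A chained hash table along these lines might look like the sketch below. It leans on Python's built-in `hash()` as the randomising function; the class name, the starting size of eight slots, the load factor limit of 2.0 and the decision to double on growth are all assumptions I've picked for the example.

```python
class ChainedHashTable:
    """Hash table with chaining; doubles when the load factor hits a limit."""

    def __init__(self, slots=8, max_load=2.0):
        self._slots = [[] for _ in range(slots)]
        self._count = 0
        self._max_load = max_load  # assumed threshold, chosen for illustration

    def _chain(self, name):
        # hash() is Python's built-in; mod reduces it to a slot index.
        return self._slots[hash(name) % len(self._slots)]

    def exists(self, name):
        # Sequential search, but only through one short chain.
        return any(pair_name == name for pair_name, _ in self._chain(name))

    def insert(self, name, value):
        if self.exists(name):
            raise KeyError(f"{name!r} already exists")
        self._chain(name).append((name, value))
        self._count += 1
        if self._count / len(self._slots) >= self._max_load:
            self._grow()

    def delete(self, name):
        chain = self._chain(name)
        for i, (pair_name, _) in enumerate(chain):
            if pair_name == name:
                del chain[i]
                self._count -= 1
                return

    def _grow(self):
        # Create an array twice the size and rehash every pair into it: O(N).
        old_pairs = [pair for chain in self._slots for pair in chain]
        self._slots = [[] for _ in range(2 * len(self._slots))]
        for name, value in old_pairs:
            self._chain(name).append((name, value))


table = ChainedHashTable()
for i in range(1000):
    table.insert(f"name{i}", i)
print(table.exists("name500"), table.exists("missing"))  # True False
print(len(table._slots))  # 512
```

With these settings the table has grown six times by the 1,000th insertion, so the average chain holds about two pairs however many names we add.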

‘Insert’ is a more difficult operation to analyse. On the face of it, it’s O(1) – both the ‘Search’ and ‘Add’ functions are constant time operations in general – but every now and then, a reorganisation will take place on an insertion operation. In general, hash tables are written such that they double in size when they grow. This is an O(N) operation, but we can amortise it over all previous insertion operations, so that, overall, ‘Insert’ remains O(1). Best case then is O(1), worst case is O(N), amortised case is O(1).
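The amortisation argument is easier to believe with numbers. The little simulation below counts how many pairs get rehashed in total over a run of insertions; it assumes, purely for simplicity, a table that starts with eight slots and doubles whenever it becomes full.

```python
def rehash_moves(total_inserts, initial_slots=8):
    """Count pairs moved by doubling reorganisations over a run of inserts."""
    slots, count, moves = initial_slots, 0, 0
    for _ in range(total_inserts):
        count += 1
        if count == slots:   # assumed rule: grow when the table is full
            moves += count   # every pair is rehashed into the new array
            slots *= 2
    return moves


print(rehash_moves(1_000))      # 1016
print(rehash_moves(1_000_000))  # 1048568
```

Because each reorganisation is twice the size of the one before, the total never reaches 2N moves for N insertions. Spread over all the insertions, that's a constant amount of extra work each, which is what ‘amortised O(1)’ means.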

The same types of arguments can be made about ‘Delete’, although in general we tend not to shrink a hash table anywhere near as often as we make it bigger. ‘Delete’ is then O(1), meaning that the amortised use of a hash table over all its operations is O(1). There is, of course, still that warning that every now and then you will hit the O(N) worst case on an insertion.

Binary tree

The next data structure we can use is a balanced binary search tree, such as a red-black tree. This, like the sorted array version, makes the assumption that names can be sorted.

In a binary tree, the efficiency of search operations is O(d), where d is the maximum depth of the tree (the number of levels from the root of the tree to the furthest leaf). Since a perfectly balanced binary search tree is equivalent to binary search on a sorted array (every link you decide to follow will enable you to ignore a whole chunk of the tree), ‘Exists’ is on average O(logN). Best case is still O(1), but what about worst case? That depends on the algorithm used to balance the binary tree. Balancing is never perfect but, using red-black trees as an example, we can prove that they’re constructed such that the longest path is a maximum of twice the length of the shortest path. If you like, O(2logN). Since 2 is a constant, we can take it out, making red-black trees O(logN) in the worst case for ‘Exists’.
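A full red-black tree is too long to show here, so the sketch below uses a plain binary search tree with no balancing at all, which is my own simplification. It shows why the depth matters: feed it names in sorted order and it degenerates into one long chain, feed it the same names middle-first and it comes out perfectly balanced. All the function names are made up for the example.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.left = None
        self.right = None


def insert(root, name):
    """Insert into a plain (unbalanced) binary search tree, iteratively."""
    if root is None:
        return Node(name)
    node = root
    while True:
        if name < node.name:
            if node.left is None:
                node.left = Node(name)
                return root
            node = node.left
        elif name > node.name:
            if node.right is None:
                node.right = Node(name)
                return root
            node = node.right
        else:
            return root  # already present


def search_steps(root, name):
    """Return (found, nodes visited); the visit count is at most the depth d."""
    node, steps = root, 0
    while node is not None:
        steps += 1
        if name == node.name:
            return True, steps
        node = node.left if name < node.name else node.right
    return False, steps


def build(names):
    root = None
    for name in names:
        root = insert(root, name)
    return root


def balanced_order(lo, hi):
    """Yield lo..hi-1 middle-first, so the plain tree comes out balanced."""
    if lo < hi:
        mid = (lo + hi) // 2
        yield mid
        yield from balanced_order(lo, mid)
        yield from balanced_order(mid + 1, hi)


degenerate = build(range(1023))            # sorted input: one long chain
balanced = build(balanced_order(0, 1023))  # middle-first input
print(search_steps(degenerate, 1022))  # (True, 1023)
print(search_steps(balanced, 1022))    # (True, 10)
```

A red-black tree does the rebalancing for you as pairs are inserted and deleted, so you get something close to the second result whatever order the names arrive in.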

For ‘Insert’ and ‘Delete’, there’s a lot of mathematics that can prove that they’re both O(logN) as well. In essence, the search is O(logN), and the addition of the new node or removal of the old node is O(1) on average.

So, overall, a red-black tree is O(logN) in all its operations. Perhaps more importantly, it has guaranteed O(logN) time even in the worst case. This means that some people will prefer to use a red-black tree for their dictionary instead of a hash table because they don’t want to hit the possibility of O(N) insertion.

Figure 1: Graphing some common big-Oh expressions (O(N^2) is cut off so we can see the others).

From this discussion, you should now have a basic understanding of how to read and understand big-Oh expressions and how to evaluate algorithms and data structures based on them. Figure 1, above, illustrates the runtime for various common big-Oh expressions.

Radix trees

Radix trees offer a further data structure that can be used for a dictionary. A radix tree stores prefixes to keys rather than complete keys in its nodes, and each node can have many children. A key is then found as a complete path through the tree from root to leaf – at each step down the tree, you compare another small part of the name to the next node.

Figure 2, below, shows an example radix tree storing a small set of words. In searching for ‘hostess’, we follow the left link from the root, matching ‘host’, then follow the middle link matching the ‘e’ and finally matching the ‘ss’ in the right node.

Unlike the other data structures we’ve looked at, the efficiency of a radix tree doesn’t depend on the number of name-value pairs, but instead on the length of the keys. All operations are essentially O(k), where k is the maximum name length in the radix tree. This can be greater than the balanced binary tree’s O(logN), for example, but in practice we find that the comparisons needed in a binary tree are also significant, so the radix tree can be a viable alternative.

Figure 2: A small radix tree, using middle dot to indicate end of word.
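To see the O(k) behaviour in code, here is a sketch of a simple trie. Be aware that this is not a true radix tree: a trie stores one character per link, whereas a radix tree merges chains of single-child nodes into longer prefixes. The search logic is the same idea, though, and the class name and the sample words other than ‘hostess’ are my own.

```python
class TrieDictionary:
    """Uncompressed trie: one character per link (a radix tree merges chains)."""

    _END = object()  # marker key holding the value at the end of a name

    def __init__(self):
        self._root = {}

    def insert(self, name, value):
        node = self._root
        for ch in name:  # k steps for a name of length k
            node = node.setdefault(ch, {})
        if self._END in node:
            raise KeyError(f"{name!r} already exists")
        node[self._END] = value

    def exists(self, name):
        node = self._root
        for ch in name:
            node = node.get(ch)
            if node is None:
                return False
        return self._END in node


d = TrieDictionary()
for word in ["host", "hostess", "hostel"]:
    d.insert(word, len(word))
print(d.exists("hostess"), d.exists("hoste"))  # True False
```

Neither loop mentions how many words are stored; the work done depends only on the length of the name being looked up.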

Ternary trees

Back in issue 282, I cited ternary search trees as a strong candidate for the data structure behind a dictionary. Ternary trees, like radix trees, have a runtime efficiency that’s dictated by the length of the keys rather than their number, but are much easier to implement. Ternary search trees and radix trees also have a further benefit: using them means you can easily produce a sorted list of names in the dictionary, as well as produce a prefix list (a list of names with a particular prefix).
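Here is a minimal sketch of a ternary search tree that shows both of those benefits, the sorted list and the prefix list. It's an illustration rather than a tuned implementation: the names are my own, there's no ‘Delete’, and empty names are rejected to keep the code short.

```python
class TSTNode:
    def __init__(self, ch):
        self.ch = ch
        self.lo = self.eq = self.hi = None
        self.is_end = False
        self.value = None


class TernarySearchTree:
    def __init__(self):
        self._root = None

    def insert(self, name, value):
        if not name:
            raise ValueError("empty names are not supported in this sketch")
        self._root = self._insert(self._root, name, 0, value)

    def _insert(self, node, name, i, value):
        ch = name[i]
        if node is None:
            node = TSTNode(ch)
        if ch < node.ch:
            node.lo = self._insert(node.lo, name, i, value)
        elif ch > node.ch:
            node.hi = self._insert(node.hi, name, i, value)
        elif i + 1 < len(name):
            node.eq = self._insert(node.eq, name, i + 1, value)
        else:
            node.is_end, node.value = True, value  # replaces any old value
        return node

    def _walk(self, node, prefix, out):
        # In-order walk (lo, this node, eq, hi) yields names in sorted order.
        if node is None:
            return
        self._walk(node.lo, prefix, out)
        if node.is_end:
            out.append(prefix + node.ch)
        self._walk(node.eq, prefix + node.ch, out)
        self._walk(node.hi, prefix, out)

    def names(self):
        out = []
        self._walk(self._root, "", out)
        return out

    def with_prefix(self, prefix):
        if not prefix:
            return self.names()
        node, i = self._root, 0
        while node is not None:
            if prefix[i] < node.ch:
                node = node.lo
            elif prefix[i] > node.ch:
                node = node.hi
            elif i + 1 < len(prefix):
                node, i = node.eq, i + 1
            else:
                break  # node matches the last character of the prefix
        if node is None:
            return []
        out = [prefix] if node.is_end else []
        self._walk(node.eq, prefix, out)
        return out


tree = TernarySearchTree()
for word in ["hostess", "host", "hot", "hostel", "ham"]:
    tree.insert(word, len(word))
print(tree.names())             # ['ham', 'host', 'hostel', 'hostess', 'hot']
print(tree.with_prefix("hos"))  # ['host', 'hostel', 'hostess']
```

The sorted list falls out of the tree's shape for free, which is something a hash table can't offer without a separate sort.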

Profiling

All of the efficiency results quoted in this article are theoretical. They are all of the form ‘for large values of N the efficiency is proportional to some expression in N’, but make no mention of the size of the constant of proportionality. Therefore, when deciding on which data structure to use in your dictionary, you should profile actual code running on your actual data. It’s pointless worrying, for example, about the efficiency of millions of items in a dictionary when you’ll only have 100.
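As a starting point for that kind of measurement, the sketch below uses `timeit` from Python's standard library to time a failed lookup in a plain list (sequential search) and in Python's built-in `dict` (a hash table). The helper name, the sizes and the repeat count are arbitrary choices of mine, and the timings you see will depend entirely on your machine, so I haven't quoted any.

```python
import timeit


def time_lookup(container, target, repeats=200):
    """Total seconds taken by `repeats` membership tests."""
    return timeit.timeit(lambda: target in container, number=repeats)


for n in (100, 10_000):
    names = [f"name{i}" for i in range(n)]
    as_list = names                  # O(N) sequential search
    as_dict = dict.fromkeys(names)   # hash table, O(1) on average
    missing = "not-there"
    print(n, time_lookup(as_list, missing), time_lookup(as_dict, missing))
```

Run it with sizes and names that resemble your real data before drawing any conclusions.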

May 06

The run up to this election has seen politicians promoting themselves more via Twitter feeds or Facebook groups than by kissing babies in the street. Clearly this shows us technophiles that today’s politicians are embracing the internet age – or does it? How much of this is just posturing? Did the empty seats in the House of Commons during the debate over the Digital Economy Act indicate that most MPs didn’t understand the Act’s impact, or that they just didn’t care? We decided to find out just what technology policies the different parties are offering, and interviewed the people who will be writing the tech manifesto for their parties if they win the election. There are some impressive claims being bandied about.

Parliament could have a very different look depending on your vote. (Parliamentary images reproduced with permission of Parliament)

Labour has been very vocal about its technology policies, not least its Digital Economy Act. This gives the government the right to block sites infringing copyright and ban downloaders from accessing the internet. They are also championing rolling out high-speed internet (well, 2Mbps) for everyone by 2012 and “superfast” 100Mbps broadband for 90 per cent of the population by 2017.

The Conservatives also want super-fast broadband for most of the country (which would be achieved by opening up BT’s infrastructure to other companies) and would reduce the corporation tax rate to encourage new technology businesses to set up in the UK.

The Liberal Democrats were vociferous in their opposition to the Digital Economy Act, and have plans to uphold net neutrality and overhaul copyright law. They too want high-speed broadband for all.

So just what are the most important tech policies for each party? Read on to see.

Conservative party

Jeremy Hunt, Shadow Secretary for Culture, Media and Sport told us: “Our key policy to promote the technology industry is to ensure that Britain has a modern, fast broadband infrastructure. We will deregulate the market and force BT to give access to its underground ducts and overhead telegraph poles to rival ISPs. This will allow ISPs to lay their own fibre at a lower cost, and a super-fast broadband-supporting fibre network will be established over much larger parts of the UK. Funding, where needed, would come from the Digital Switchover segment of the licence fee.”

What about the technological economy? “It’s vital that we encourage technology companies to set up in the UK. We’ll cut the headline rate of corporation tax to 25p or lower and the small companies’ rate to 20p, funded by reducing complex allowances.”

What in the Tories’ opinion have Labour got wrong? “An over reliance on massive-scale IT projects that have gone over budget and not been delivered on time. We will create a level playing field for open-source IT in government procurement and open up government IT contracts to [smaller companies] by breaking up large IT projects into smaller components.”

  • Super-fast broadband for all: Will deregulate the market and open up BT’s infrastructure to competitors. Paid for using the Digital Switchover section of the BBC licence fee.
  • Right to data: Statistics like street-by-street crime levels and power consumption of Government buildings to be put online.
  • Government to use open-source IT: Cost of large-scale IT projects would be reduced.
  • Cap government IT projects at £100m: Would let smaller IT companies help out.
  • Corporation tax reduction: Corporation tax rate reduced to 25p to attract tech companies.

Labour Party

Ben Bradshaw, Secretary of State for Culture, Media and Sport told us: “Labour wants Britain to be the world leader in the digital economy. We will create over 250,000 skilled jobs by 2020 and [become] the world leader in public service delivery.

“We will ensure universal access to today’s broadband services at 2Mbps by 2012 – this will be delivered through upgrades to the existing networks and be supported with public funding including the underspend from the Digital Switchover Help scheme.”

“The Digital Economy Act is a key part of our active industrial strategy, helping us maintain and build on the digital economy. It ensures a competitive digital communications infrastructure [and] protects intellectual property. The Conservatives offered no practical solutions on [either] of these.”

Labour also has plans for a new technology institute: “The Institute of Web Science will be based in Britain and will work with government and business to realise the social and economic benefits of technological advances. It will assemble the best of the world’s scientists and researchers and be headed by Sir Tim Berners-Lee and the leading web science expert Professor Nigel Shadbolt.”

  • High-speed broadband: 2Mbps for all by 2012, paid for by a fixed telephone line levy of 50p a month. 100Mbps for 90 per cent of the country by 2017.
  • Government to use cloud computing: Would save £3.2billion annually
  • Digital Economy Act: Passed to provide a competitive digital communications infrastructure and protect intellectual property.
  • Home Access scheme: Reduce the number of non-internet users by 60 per cent by 2014.
  • Will increase scope of data.gov.uk: More previously private government data online.

Liberal Democrats

We asked Don Foster, Shadow Secretary of State for Culture, Media and Sport, which Lib Dem policies would interest a PC Plus reader. “Our policies on broadband roll-out and extending IT skills, and the work we’ve done on the Digital Economy Bill,” he answered. “The more people who have the skills and access to make use of technology, the more useful it becomes.

“We do not believe that the country’s broadband infrastructure can be left solely to market forces, which is why we advocate an outside-in use of public funds to begin delivering broadband from day one to rural areas. The market will deliver the infrastructure in urban areas.”

How would the Liberal Democrats attract tech companies to the UK? “We support the proposed tax break for the video games industry and we will also tackle the growing burden of red tape, which continues to cost businesses increasing amounts of time.”

What was the current Government’s biggest technology mistake? “Not prioritising the Digital Economy Bill in debate. The Bill was an opportunity to ensure that everyone is able to take advantage of the opportunities presented by the internet. The government’s unwillingness to give the Bill the time necessary for proper parliamentary scrutiny shows how low in their priorities it sat.”

  • Superfast broadband for all: Would use public funds to get broadband for everybody available straight away.
  • Support for tech business: Tax breaks for video-games companies and high-tech industries to encourage growth.
  • Overhaul copyright law: Update laws to reflect the technology of modern-day society.
  • Support for net neutrality: Would strongly oppose blocking of internet sites.

Small parties with big tech policies

What do the smaller parties contending for parliamentary seats offer to people passionate about the web, computers and technology? We spoke to five and found out. We also asked which of the policies that the big three parties were proposing could, in their opinion, damage the internet and technological development.

During our fact-finding mission we found much of interest. The Communist Party want to make tech-literacy part of the educational process as they feel computers are no longer a luxury, but a necessity. The Green party think broadband should be a fixed-rate service available to all. The Pirate Party UK are all about free speech – and not just on the internet. They want to ensure that personal privacy is a priority of government and industry. Plaid Cymru want an ultra-fast national broadband network and the SNP aren’t sure the Digital Economy Act goes far enough. The last word goes to the Monster Raving Loony Party, who, when contacted said: “What on earth are you talking about? Don’t be so serious.”

Ben Stevenson, National Secretary, Communist Party

www.communist-party.org.uk

“There are IT-related issues important to us. We oppose the Digital Economy Act, a sop to international big business. We recognise the need for a far-reaching and proactive approach to ensure all citizens can access all aspects of technology. It’s obvious that the speed of the technological revolution has made basic access and experience in using computers and the internet a necessity. Britain’s broadband speed currently ranks 17th in the world, 23 times slower than Japan. We need an integrated communications strategy to bridge the gap. Education is essential to ensuring that tech-literacy is considered a vital part of modern life in Britain. IT needs to be given greater prominence in the National Curriculum and to be fully integrated into all its aspects.”

Andy Robinson, Party Leader, Pirate Party

www.pirateparty.org.uk

“We don’t just know what technology is, we know how it works and how it has affected our society. In the modern age, existing copyright and patent laws do not make sense. Our policy is to shorten the duration of copyright to five years, and to allow the sharing of copyrighted material provided that no profit comes of it.

“The Digital Economy Act is a terrible piece of legislation. It legitimises corporate spying on individuals, forces ISPs to throttle or even ‘suspend’ connections based only on allegations of infringement and allows copyright holders to demand ISPs censor websites on the flimsiest of evidence. PPUK is unquestionably against these policies.”

Caroline Lucas, Party Leader, Green Party

www.greenparty.org.uk

“The Green Party believes that the development of computer communications has reached the point where BT should have an obligation to provide broadband-capable infrastructure to every household. Funding for marginal ‘uneconomic’ lines may come from a small levy on every access line. The principle of universal access at the same base price to the household should prevail.

“Many of us believe that [the Digital Economy Bill] threatens to infringe fundamental human rights through the disconnection of internet accounts and the new ‘website blocking’ laws could result in new ways to suppress free speech and legitimate activity.”

Lowri Jackson, Research and Policy, Plaid Cymru

www.plaidcymru.org

“Connecting Wales to the world digitally will encourage innovation and job creation. We believe that new technology must be harnessed to provide Wales with a strong voice on the global stage and to ensure that there are no communication ‘not spots’. We call for research into the construction of a super-fast national broadband network. We also support compulsory network sharing between mobile phone and broadband operators. Westminster can learn a lot from the National Assembly with its transparent and democratic processes. We’re also concerned about the threat to our civil liberties implied by increased internet monitoring, and will campaign for freedom of the internet.”

Pete Wishart, Culture and Broadcasting, Scottish National Party

www.snp.org

“While the ambition to secure universal broadband access is to be welcomed, more must be done to protect those who are working in our creative economy. Our creative industries contribute significantly to the economy and are a key route to economic recovery, yet protection for artists and creators remains an afterthought. Writing letters to persistent downloaders and threatening slower internet speeds seems a feeble response to the loss of millions of pounds of income to artists and creators. What is needed is a mixture of effective technical measures and creative solutions, but above all the political will to tackle this problem.”


Question time with the Pirate Party

For some political parties technology is the reason they exist, and it seems to be paying off. The Swedish Pirate Party caused a stir last year when it gained two seats in the European Parliament. The UK’s Pirate Party has the same three core platforms: reform copyright and patent law, end ‘excessive’ surveillance of innocent people and ensure ‘real’ freedom of speech. But all these issues have possible downsides. We asked Pirate Party UK’s leader Andy Robinson if its policies would actually work in the real world.

PCP: Surely reforming copyright and patent law will damage British businesses?

Andy Robinson: There are always winners and losers when any law changes. Reforming copyright law will reduce the power of record industry ‘rights-holders’ to dictate what music we get to hear and what we don’t. It will also benefit lesser-known musicians who don’t want to sign away future royalties to get their music heard. Reforming patent laws to fix problems like the ring-fencing of huge areas by overly broad patents will increase competition and reduce red-tape. A better regime would encourage manufacturing and design investment, boosting the economy at no cost to taxpayers.

Did you know it’s illegal to sing ‘Happy Birthday’ in public without paying a fee? The Pirate Party would like to put an end to this.

PCP: Isn’t surveillance central to UK security?

AR: A certain degree of surveillance is necessary, but we urgently need to set sensible limits on it. Vehicles are being tracked: the police’s automatic number plate recognition camera network takes 14 million photos a day. We need rules that say how much is too much, before we sleepwalk into a surveillance state.

PCP: One man’s freedom of speech is another man’s persecution. How do you intend to protect the weaker and less vocal in society from the strongest and loudest?

AR: New media outlets empower many of the people who were previously disempowered to have their say. The best counter to persecution is not censorship, but education. Teaching people to get together and stand up for themselves is far better than short-term measures taken just so politicians can be seen to be doing something. That said, we support current equality legislation banning unfair discrimination and would not change this policy.

PCP: Where do you stand on more prosaic issues like internet speed?

AR: Consumers have been complaining about this for years. We plan a system where payment will be based on the speed the user actually gets, not the advertised headline speed. Of course, we’ll be unable to achieve any of our aims, prosaic or otherwise, without votes or the support of donations through our site (www.ppuk.it/donate).

Apr 27

Electronic Data Capture (EDC) By: KunalHanda


A pharmaceutical company or sponsor may have one particular interest in clinical data; a research or academic institute may have another. Whatever the case, the major role of clinical data management is to collect clinical trial data and ensure that it is error-free, consistent and complete. Data is generated at the clinical trial site and stored on paper or, more recently, in an electronic data capture (EDC) system.


The site is usually a hospital or clinic where the patients in the clinical trial are recruited and given the drug treatment according to a well-defined protocol. Many guidelines and laws govern the conduct of the trial to ensure the safety of the patients involved. The clinical trial data gathered at the investigator site in the case report form (CRF) is transcribed into the clinical data management system (CDMS).


Some of the popular platforms used globally are Oracle Clinical and Clintrial. The EDC platforms, in which the data is entered directly into the system at the site, include Inform, Medidata and Oracle Clinical. In paper-based trials, the system employs double data entry to reduce the possibility of human entry errors and ensure high data quality. Once the data has been screened for typographical errors, it is validated to check for logical errors. The entered data is cleaned, reviewed, extracted and provided to the biostatisticians for review.
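As a rough illustration of those two screening stages, the sketch below compares two independent entry passes of the same form and then applies a couple of logical checks. Everything in it is hypothetical: the field names, the sample values and the rules are invented for the example and are not taken from any of the platforms named above.

```python
def double_entry_mismatches(first_pass, second_pass):
    """Return the fields whose values differ between two entry passes."""
    fields = sorted(set(first_pass) | set(second_pass))
    return [f for f in fields if first_pass.get(f) != second_pass.get(f)]


def logical_errors(record):
    """Hypothetical edit checks; a real study defines its own rules."""
    errors = []
    if not 0 < record.get("age_years", 0) < 120:
        errors.append("age_years out of range")
    # ISO-format date strings compare correctly as plain text.
    if record.get("visit_date", "") < record.get("consent_date", ""):
        errors.append("visit_date precedes consent_date")
    return errors


first = {"subject_id": "001", "age_years": 54,
         "consent_date": "2010-03-01", "visit_date": "2010-03-15"}
second = {"subject_id": "001", "age_years": 45,
          "consent_date": "2010-03-01", "visit_date": "2010-03-15"}
print(double_entry_mismatches(first, second))  # ['age_years']

suspect = {"age_years": 54, "consent_date": "2010-03-01",
           "visit_date": "2010-02-15"}
print(logical_errors(suspect))  # ['visit_date precedes consent_date']
```

The first check catches typing slips that only one operator made; the second catches values that were typed consistently but cannot be right.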


At the end of the clinical trial, the data in the CDMS is analyzed and sent to the regulatory authorities for approval.

Electronic Data Capture: The Future

Globally, we are observing a shift from the traditional paper-based study to the electronic version, commonly termed electronic data capture (EDC). The major reason for this can be attributed to the growth of the information technology (IT) service sector for life sciences. EDC has not only made it easier to capture data remotely from various sites; with its inbuilt validation and edit checks, it has also made it possible to collect error-free data at the very first stage. EDC systems have also made the handling of clinical trial data more secure and efficient. The data from EDC systems can be exported in various formats, such as CDISC and XML.


With tremendous growth in storage technologies (SAN, NAS), it has become possible to store terabytes of EDC trial data securely in a small space, and more cost-effectively than storing the large volumes of paper documents produced by traditional paper trials. Previously, cleaning paper-based trial data took far longer than cleaning trial data captured in EDC does. As the volume of information collected in clinical trials continues to grow, data collection and management is becoming a priority for pharmaceutical companies and clinical research organizations. One of the critical components of clinical trials is capturing accurate data in time to bring potential new drugs to market. EDC systems are used in all phases of clinical trials to collect, manage and report clinical and laboratory data. EDC provides both the tools and the process infrastructure necessary to achieve the needed data quality as well as process scalability.


About the Author

MakroCare is a multidisciplinary, knowledge- and technology-driven clinical research organization (CRO) that offers CDM (paper and electronic), EDC and many other services. It carries out data analysis, reporting and submission of trial data to the regulatory agencies at a fast pace, saving precious time and cost.
Article Source: http://www.articlesbase.com/