Why computers learn by analyzing ediscovery data.

What is AI? In short, AI stands for artificial intelligence, which is the simulation of human intelligence by computers.

Computers using AI can be said to think and learn based on experience and feedback. Instead of only following rules that have been explicitly documented, AI systems can extrapolate from information they’ve analyzed and create new rules. Because no one specifically programmed the AI system on how to solve a problem, people often can’t explain how an AI program reaches its conclusion.

One of the key characteristics of AI is that its development depends on having a tremendous quantity of high-quality data. For computers to learn to think, they must be able to digest and analyze huge volumes of data or electronically stored information (ESI). Without sufficient data, AI systems learn poorly or not at all.

And that’s good news for ediscovery professionals, because the legal industry is positively drowning in data — too much for mere humans to effectively process. This has led to the rise of technology-assisted review (TAR), which is sometimes referred to as computer-assisted review (CAR).

But TAR isn’t the only application of AI in ediscovery. AI can help lawyers and ediscovery professionals manage data more efficiently and effectively well beyond document review; it has applications at every stage of ediscovery.

For instance, AI analytics can apply to ediscovery in the early phases of collecting and preserving ESI through techniques like near-duplicate detection and email threading. Unlike deduplication, which removes exact copies, near-duplicate detection (or near-deduping) groups together documents that differ only slightly so that they can be assessed together, saving time for human review teams. Similarly, email threading pulls together related messages and presents them in one consolidated conversation to allow for more accurate coding and analysis.
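
To make the idea of near-duplicate grouping more concrete, here is a minimal sketch in Python. It assumes one common approach (comparing overlapping three-word "shingles" with Jaccard similarity) and is not a description of how any particular ediscovery product works. The function names, the sample emails, and the 0.6 threshold are all illustrative assumptions.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in text (lowercased, punctuation ignored)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Very short texts become a single shingle.
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity: size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_near_duplicates(docs, threshold=0.6):
    """Greedy grouping: each document joins the first group whose
    representative (the group's first document) is similar enough."""
    groups = []  # list of (representative_shingles, [document indices])
    for i, doc in enumerate(docs):
        s = shingles(doc)
        for rep, members in groups:
            if jaccard(s, rep) >= threshold:
                members.append(i)
                break
        else:
            groups.append((s, [i]))
    return [members for _, members in groups]


# Hypothetical sample data: the second email only adds a sign-off.
emails = [
    "Please review the attached contract before Friday's meeting.",
    "Please review the attached contract before Friday's meeting. Thanks, Dana",
    "Lunch is in the break room at noon today.",
]
print(group_near_duplicates(emails))  # [[0, 1], [2]]
```

The first two emails land in one group because they share most of their shingles, while the unrelated third email forms its own group. A reviewer could then code the first group once instead of twice.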

Many AI applications in ediscovery are built around a capability known as natural language processing (NLP). In NLP, computers learn to decipher concepts regardless of the specific wording used to express them. To take a simple example, friends parting ways might say “goodbye,” “bye,” “see you later,” or “take care.” An NLP program could recognize that all of these expressions mean the same thing.

NLP allows computers to group together similar content using concept clustering or document clustering. As AI software learns to recognize the relevant concepts in an ediscovery matter, it can group together and label data that relates to those concepts. For example, in a personal injury case, concept clustering allows software to separate information about potential liability from information about the extent of the plaintiff’s injuries. Reviewers can then analyze these clusters separately, enhancing their efficiency.
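
The clustering step can be illustrated with a deliberately simplified Python sketch. Real concept clustering relies on NLP models that capture meaning; this example only compares word counts with cosine similarity, so treat it as a stand-in for the idea rather than an actual implementation. The stopword list, the 0.25 threshold, the function names, and the sample documents are all hypothetical.

```python
import math
import re
from collections import Counter

# Small, hand-picked stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "on",
             "was", "at", "for", "is", "had"}


def vectorize(text):
    """Turn text into a bag-of-words count vector, ignoring stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def cluster(docs, threshold=0.25):
    """Greedy clustering: add each document to the most similar existing
    cluster, or start a new cluster if none reaches the threshold."""
    clusters = []  # each entry: {"centroid": Counter, "members": [indices]}
    for i, doc in enumerate(docs):
        vec = vectorize(doc)
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(vec, c["centroid"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "members": [i]})
        else:
            best["centroid"].update(vec)  # fold the document into the cluster
            best["members"].append(i)
    return [c["members"] for c in clusters]


# Hypothetical personal injury documents: two on liability, two on injuries.
documents = [
    "The driver ran the red light and was speeding at the intersection.",
    "Witness says the driver was speeding through the red light.",
    "The plaintiff suffered a fractured wrist and needed surgery.",
    "Medical records show the plaintiff had wrist surgery for a fractured bone.",
]
print(cluster(documents))  # [[0, 1], [2, 3]]
```

Here the liability documents and the injury documents separate into two clusters because they share vocabulary within each topic. Unlike a true NLP system, this sketch would miss documents that express the same concept in entirely different words.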

One caution: AI has become a buzzword; not everything that’s referred to as “AI” actually uses artificial intelligence.

Glossary definition

Artificial intelligence (AI) is the computer simulation of human intelligence. AI systems learn by analyzing data and receiving feedback. Common AI applications in ediscovery include technology-assisted review (TAR) and concept clustering based on natural language processing.