Post Process

Everything to do with E-discovery & ESI

Archive for the ‘Search Protocols’ Category

Case Blurb: Creative Pipe; Not all keyword searches are created equal

Posted by rjbiii on June 15, 2008

While it is known that [Producing Party] and [Producing Party’s attorneys] selected the keywords, nothing is known from the affidavits provided to the court regarding their qualifications for designing a search and information retrieval strategy that could be expected to produce an effective and reliable privilege review. As will be discussed, while it is universally acknowledged that keyword searches are useful tools for search and retrieval of ESI, all keyword searches are not created equal; and there is a growing body of literature that highlights the risks associated with conducting an unreliable or inadequate keyword search or relying exclusively on such searches for privilege review. Additionally, the Defendants do not assert that any sampling was done of the text searchable ESI files that were determined not to contain privileged information on the basis of the keyword search to see if the search results were reliable. Common sense suggests that even a properly designed and executed keyword search may prove to be over-inclusive or under-inclusive, resulting in the identification of documents as privileged which are not, and non-privileged which, in fact, are. The only prudent way to test the reliability of the keyword search is to perform some appropriate sampling of the documents determined to be privileged and those determined not to be in order to arrive at a comfort level that the categories are neither over-inclusive nor under-inclusive.

Victor Stanley, Inc. v. Creative Pipe, Inc., 2008 WL 2221841 (D.Md. May 29, 2008)
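The sampling the court describes can be sketched in a few lines. This is a minimal illustration, not a method taken from the opinion: the function name sample_error_rate, the sample size of 400, and the synthetic document set are all assumptions made for the example, and the margin of error uses a simple normal approximation that is rough for small samples or for rates near zero.

```python
import math
import random

def sample_error_rate(doc_ids, is_misclassified, sample_size, seed=0):
    """Estimate the share of misclassified documents from a simple random sample.

    doc_ids          -- identifiers of the documents in ONE category
                        (e.g. everything the keyword search marked "not privileged")
    is_misclassified -- callable standing in for a human reviewer's judgment
    Returns (estimated_rate, margin_of_error_95, documents_sampled).
    """
    doc_ids = list(doc_ids)
    if not doc_ids:
        raise ValueError("nothing to sample")
    rng = random.Random(seed)  # fixed seed so the draw can be documented and repeated
    sample = rng.sample(doc_ids, min(sample_size, len(doc_ids)))
    n = len(sample)
    errors = sum(1 for d in sample if is_misclassified(d))
    p = errors / n
    # Normal-approximation 95% margin of error (an assumption of this sketch)
    margin = 1.96 * math.sqrt(p * (1 - p) / n)
    return p, margin, n

# Hypothetical demo: 10,000 documents the keyword search marked "not privileged",
# 200 of which a reviewer would in fact call privileged (synthetic data).
not_privileged_ids = list(range(10_000))
actually_privileged = set(range(0, 10_000, 50))
rate, margin, n = sample_error_rate(
    not_privileged_ids, actually_privileged.__contains__, 400
)
print(f"sampled {n}: estimated miss rate {rate:.1%} +/- {margin:.1%}")
```

Running the same function over the set marked privileged gives the corresponding check for over-inclusion, so both categories the court mentions are tested.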


Posted in 4th Circuit, Best Practices, Case Blurbs, D. Md., Magistrate Judge Paul W. Grimm, Search Protocols

Google Search gets Personal

Posted by rjbiii on November 30, 2007

Google is adding functionality to its searches that will allow users some input into the ranking and sorting of results:

Google has rolled out a new option in its Labs-based experimental search program which allows you to rank and re-order search results. The new experiment is reportedly showing up for select users only, but the help page says that the goal is to allow you to “influence your search experience by adding, moving, and removing search results.”

Those of us in EDD are always looking for ways to tweak searches to better fit them to our clients’ needs. It will be interesting to see how easily and effectively users are able to influence the accuracy of these searches, and how soon such technology makes it into review platforms and the like.

Posted in Articles, Search Engine Technology, Search Protocols

E-Discovery Pitfalls: Court dictates collections and search protocols

Posted by rjbiii on November 9, 2007

The latest in our series on e-discovery pitfalls.

K&L Gates has posted an opinion in which U.S. Magistrate Howard R. Lloyd dictates the collection and search protocols of a set of data over which the parties have become somewhat contentious. Let us begin with His Honor’s description of the dispute:

According to defendants, there are two hard drives in question. In July 2007, they reportedly made bit-for-bit copies of those hard drives (including recovered deleted files and fragments) and produced documents responsive to plaintiff’s requests. Plaintiff is skeptical about the production.

Well, the requesting party is always skeptical, isn’t it? What circumstances give merit to plaintiff’s suspicions?

[Plaintiff/Requesting Party] says that, to date, defendant Romi Mayder has produced only one email pertaining to his work at Silicon Test Systems, Inc. whereas Bob Pochowski, a third-party witness, has produced a host of documents (emails, data sheets, and the like) from Mayder that apparently were created during Mayder’s employment at Verigy.

Oops. This illustrates the dangers of working with highly distributable and “copyable” documents, such as e-mail, and not producing a full set (for whatever reason). Even in the days of paper, you never knew where all the copies might have been hiding. In this digital age of ours, with the ease of replication and distribution, the dangers are exponentially higher. So let us remember two things: do a good job of formulating an appropriate search protocol; and, of course, never deliberately exclude relevant documents not subject to privilege from production. But the court isn’t finished with plaintiff’s suspicions.

Verigy also contends that other documents produced to date demonstrate Mayder’s willingness to manipulate evidence. Plaintiff also asserts that, when defendant Mayder left plaintiff’s employ, a system or software upgrade was performed which may have deleted files from defendants’ hard drives.

So now plaintiff goes beyond suggesting that the producing party may have accidentally failed to produce, and suggests that defendant is indifferent to its obligation to produce, or even that it purposefully manipulates data to protect itself. This serves to illustrate the importance of following a defensible, documented collection plan. The documentation may serve to refute allegations of impropriety or mismanagement. The importance of retaining a third party to execute the collection process is also on point, as such an expert tends to lend an objective voice to any dispute over procedure.
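One small piece of such documentation is a record showing that a copy matches its source. The sketch below is only an illustration of that idea, not the procedure used in this case: the function names, the log fields, and the choice of SHA-256 are assumptions, and real collections typically rely on dedicated forensic tools and write blockers.

```python
import hashlib
from datetime import datetime, timezone

def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so that large drive images need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def collection_log_entry(source_path, copy_path, collected_by):
    """Build one log record (hypothetical fields) stating whether copy and source match."""
    source_hash = sha256_of_file(source_path)
    copy_hash = sha256_of_file(copy_path)
    return {
        "source": source_path,
        "copy": copy_path,
        "sha256_source": source_hash,
        "sha256_copy": copy_hash,
        "verified": source_hash == copy_hash,  # True only for an identical copy
        "collected_by": collected_by,
        "logged_at_utc": datetime.now(timezone.utc).isoformat(),
    }
```

A log of such entries, kept with the collection notes, gives the producing party something concrete to point to if the integrity of a copy is later questioned.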

Now, this next bit is interesting, and potentially really bad for the defendant.

[Requesting Party] argues that it needs to conduct additional discovery of those hard drives, not only to determine whether any relevant documents have been withheld from defendants’ production, but also to examine what may have happened on the hard drives and why.

The requesting party wants to examine the drives to see if defendants failed in their responsibilities. The request is not made merely for the sake of satisfying their curiosity. The possibility that such intrusive measures might be allowed should be a warning shot across the bow for any party engaged in discovery. Make sure your processes are thorough, managed competently, well documented, and defensible.

[Producing Party does] not dispute that a system or software upgrade was performed which may have deleted files from their hard drives. However, they maintain that all deleted files have been recovered and preserved and that they have produced all information responsive to plaintiff’s requests.

All deleted files have been recovered? That’s far from certain, especially with respect to an operation as extensive as a software upgrade. The percentage of deleted files forensically recovered is based on many factors. Was “wiping” involved? If not, has the drive been defragmented? What is the “data turnover” (number of files deleted vs. number of new files written to the drive) of the drive at issue? Under only a very limited set of circumstances might one be able to say with any semblance of certainty that every single deleted file was recovered. As we see, the judge doesn’t appear convinced either. Upon considering the arguments, the court sets a two-tiered plan into place.

Defendants propose a two-tier protocol which (a) permits discovery in areas that defendants deem presumptively relevant; and (b) allows plaintiff to request that the expert conduct other searches, subject to an opportunity by defendant to review and object to the proposed search requests.

Defendants sought to protect themselves from abuse:

Defendants express concern that plaintiff will propound unduly burdensome or otherwise abusive searches beyond the scope of permissible discovery under Fed.R.Civ.P. 26. At the motion hearing, it was suggested, somewhat facetiously, that Verigy might attempt to request a search for all documents with the letter “A.” Indeed, documents submitted on supplemental briefing indicate that Verigy apparently has previously requested a search for all documents containing the letter “V” (see Pasquinelli Decl., Ex. C)–a request which strikes this court as being patently overbroad.

In an interesting note, the requesting party argued that disclosure of additional search terms it wanted to use might infringe attorney work product. The court, however, was not persuaded.

In concluding its opinion, the court felt the urge to remind counsel and the parties of their duties under the law:

Although it should go without saying, the parties are admonished to proceed in good faith and to refrain from conduct designed to unnecessarily encumber or retard discovery or to impose unnecessary expense or burden on the opposing parties or the court.

To reiterate the lessons of the case: engage in an honest, thorough, and well documented discovery plan; think about retaining a third party to serve as an objective, knowledgeable voice; and scrutinize the implementation of processes (such as software upgrades) that endanger the integrity of the litigation hold.

Posted in 9th Circuit, Case Summary, Computer Forensics, Discovery Requests, Motion to Compel, N.D. Cal., Search Protocols

Computer system ‘not conducive’ to keyword search?

Posted by rjbiii on November 8, 2007

In reading 3M v. Kanbar, 2007 U.S. Dist. LEXIS 78374 (N.D. Cal. Oct. 10, 2007), an opinion posted at the Electronic Discovery Blog, we ran across this passage:

Upon reviewing the emails produced, the court appreciates 3M’s concern. However, given the assertions by Rollit’s counsel, compelling another search will not cure the problems inherent in a manual search. FN3 Therefore, the court orders (each) Defendant to sign a declaration certifying that all non-privileged, responsive documents have been produced.
The declaration shall detail what Rollit (and each other defendant) and its employees have done to ensure a complete production. Given the concern over the previous omission, Defendant(s) would do well to ensure that all responsive documents have been produced before signing. Accordingly, the motion as it pertains to compelling another search is GRANTED IN PART. The declarations shall be filed with the court within 7 days of the date of this order.

FN3: It does not seem that Rollit’s computer system is conducive to an order compelling an electronic keyword search.

(emphasis added)

I wonder to what unique attributes the court refers when it states that “Rollit’s computer system is [not] conducive to an order compelling an electronic keyword search”? There is no elaboration, but I would love to see what factors convinced the court of this.

Posted in 9th Circuit, Discovery Requests, N.D. Cal., Search Protocols

Dealing with Search Criteria

Posted by rjbiii on November 8, 2007

A recent post of ours cautioned readers to be careful in formulating their initial assumptions, and to use some method of verifying them. By initial assumptions with respect to EDD, we mean assumptions about keywords, effective date ranges, and data sources that must be preserved for an electronic discovery project. A recently posted article discusses keyword searches and calls attention to one danger of not carefully considering the formulation of search criteria:

The results of a recent e-discovery keyword search should have come as no surprise. Working on a case related to a specific transaction, the attorneys requested production of all documents containing the word “buy.” Despite being cautioned against this broad search, they were reluctant to heed the warnings, and many unrelated documents were incorrectly deemed responsive. Unfortunately, it takes a $750,000 mistake like this one for some people to understand the benefits of using a strategic approach to keyword selection.

If this had been my project…well, never mind. As I have said repeatedly, it is essential for the initial assumptions used in extracting data for review to be thoroughly vetted, because the filter ultimately determines what documents the reviewer sees. Searches that are too broad cost time and money. Searches that are too narrow will miss vital data, and could cost the client even more in the long term (by skipping over helpful information or by landing them in hot water with the judge). The importance of the process of building and verifying a list should not be underestimated.

That said, keywords are not the panacea. New technologies, using concept-based ontologies and techniques, continue to evolve, and will move us beyond the era of the Boolean keyword search.
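To make the broad-versus-narrow trade-off concrete, here is a small sketch that scores candidate keyword lists against a hand-labeled sample. Everything in it is hypothetical: the five sample documents, the company name "Acme Widgets," and the helper names matches and precision_recall are invented for illustration, and whole-word matching is just one simple way to apply a term.

```python
import re

def matches(text, terms):
    """True if any term appears as a whole word or phrase (case-insensitive)."""
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE)
        for term in terms
    )

def precision_recall(docs, terms):
    """docs is a list of (text, is_responsive) pairs labeled by a reviewer.

    precision -- share of hits that are responsive (low means wasted review time)
    recall    -- share of responsive documents that were hit (low means missed data)
    """
    hits = [(text, resp) for text, resp in docs if matches(text, terms)]
    true_hits = sum(1 for _, resp in hits if resp)
    total_responsive = sum(1 for _, resp in docs if resp)
    precision = true_hits / len(hits) if hits else 0.0
    recall = true_hits / total_responsive if total_responsive else 0.0
    return precision, recall

# Hypothetical labeled sample for a dispute over one transaction.
SAMPLE = [
    ("Board approved the plan to buy Acme Widgets for $40M", True),
    ("Acme Widgets acquisition closing checklist attached", True),
    ("Can you buy milk on the way home?", False),
    ("Stop by and buy lunch for the team", False),
    ("Quarterly parking reminder", False),
]

for terms in (["buy"], ["Acme Widgets acquisition"], ["Acme Widgets"]):
    p, r = precision_recall(SAMPLE, terms)
    print(f"{terms}: precision={p:.2f} recall={r:.2f}")
```

On this toy sample the bare term "buy" pulls in unrelated documents, the long phrase misses a responsive one, and the shorter phrase does better on both measures. Real term lists would need a much larger labeled sample before anyone relied on the numbers.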

Posted in Articles, Best Practices, Cost of Discovery, Discovery, Duty to Produce, EDD Basics, Search Protocols, Trends

Keeping it all (your data, that is) together

Posted by rjbiii on September 17, 2007

DM Review has posted an article discussing the challenges of navigating the rules of compliance on one side, and discovery rules on the other:

Corporations were thus presented with a dubious choice, one that really wasn’t a choice at all: attempt to get the unstructured data genie back in the bottle in favor of the old paper-based world or lean heavily on technological tools to implement an infrastructure better equipped to handle both structured and unstructured data.

The author discusses the familiar issues with trying to find structure and patterns within unstructured data. Then, voila, something big happens:

the search and categorization industry grew up. After a few false starts and some premature hype, search and categorization tools became easier to use and, more importantly, started delivering better results. Search and categorization tools eventually became the unifying force of information management within many enterprises and professional service firms as they could make sense of huge volumes of data in a relatively effective fashion. Furthermore, search and categorization technology began solving particularly thorny issues such as records management, compliance and e-discovery, which went a long way toward cementing the critical role that search is playing in today’s enterprises. The following three brief case studies highlight the increasingly effective roles being played by search and categorization to resolve specific business issues.

In focusing on litigation, the article waxes a bit optimistic on the technology used for document review:

For the legal industry, time is money – literally. With associates’ billing rates exceeding $250/hour and partners’ upward of $500/hour, efficiency is critical. The challenge for law firms is that their incredibly valuable intellectual property (their work product and expertise) resides in multiple, separate repositories and applications, making information accessibility extremely difficult and time-consuming. Worse, particularly for large diversified firms bidding on new business, lawyers don’t know the full breadth of expertise living within the firm and will either spend a significant amount of time figuring this out or will simply avoid bringing in new clients for fear that the firm won’t be able to meet their extensive needs.

The solution: a search application that unifies access to all data within the firm in a single, easy-to-use interface, thereby giving access to all of the work product and expertise within that firm. This solution not only pulls information from the usual sources (file servers, databases and intranets) but incorporates highly sensitive sources (e.g., from time/billing systems and personnel records) and even external information feeds. And in order to meet the firm’s stringent ethical and conflict of interest-avoidance requirements, the system applies multiple levels of security to both the users of the system and the content residing in it. Thus, the legal industry has increasingly turned to this “Google for law firms” solution to make its practice far more efficient, thereby allowing them to raise their rates while actually improving their cost-effectiveness for clients.

Google for law firms, eh? I haven’t seen the killer app in lit support yet. In fact, many of the leading lights of law are just now beginning to acknowledge that “eyes only” review is not the most effective and accurate means of processing information out there. The Sedona Conference has released a new paper on using search technology in the e-discovery process (the report is available in PDF format). An excellent overview of the recommendations contained in the white paper may be found at e-Discovery Team.

What is certain is that technology associated with e-discovery still has a ways to go (although it has certainly progressed in the last few years). What is perhaps even more important is that learned counsel become, well, learned. Greater knowledge of the technological capabilities and techniques, as well as familiarity with the laws of discovery procedure with regard to e-discovery, will result in much greater efficiencies and fewer nasty surprises for clients.

Posted in Articles, Data Management, Search Protocols, The Sedona Conference, Trends