Paul Kimmel's Blog

Using Multiple Where Clauses with LINQ

     

LINQ, or Language INtegrated Query, is an all-purpose query language built right into your C# or VB code. You can use LINQ to query collections, SQL databases, DataSets, XML, and entities, and LINQ is extensible to other technologies like SharePoint and Active Directory. LINQ is a general-purpose query technology with a lot of power and some subtle features. One such feature is the where clause, which filters data much as a where clause does in SQL.

Where clauses in LINQ can have a single predicate or multiple predicates, where each expression is combined with the Boolean And or Boolean Or operator. LINQ also supports multiple where clauses. You can use multiple where clauses in a single LINQ query to break filtering up into pieces. The utility of multiple where clauses is that you can short circuit query processing, especially processor- or IO-expensive sub-operations: when an earlier filter condition (a where predicate) fails for an item, the clauses that follow it are never evaluated for that item.
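To make the short circuit visible, here is a minimal sketch that runs against an in-memory array rather than the file system. The names IsLongWord and expensiveCalls are hypothetical; they stand in for a costly check and exist only so the number of calls can be counted.

```vbnet
Imports System
Imports System.Linq

Module WhereShortCircuitSketch

  ' Hypothetical counter: records how often the costly check runs.
  Private expensiveCalls As Integer = 0

  ' Hypothetical stand-in for a processor- or IO-heavy test.
  Private Function IsLongWord(ByVal word As String) As Boolean
    expensiveCalls += 1
    Return word.Length > 5
  End Function

  Sub Main()
    Dim words = New String() {"", "filter", "", "query", "predicate"}

    ' The first where clause is the cheap check. The second where clause
    ' only sees the elements that passed the first one.
    Dim longWords = From word In words _
                    Where word.Length > 0 _
                    Where IsLongWord(word) _
                    Select word

    For Each word In longWords
      Console.WriteLine(word)
    Next

    ' Two of the five elements are empty, so the costly check runs 3 times.
    Console.WriteLine("Expensive checks run: " & expensiveCalls)
  End Sub

End Module
```

The query is evaluated lazily, so the counter is only meaningful after the For Each loop has enumerated the results.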

File IO can be system intensive. Suppose you were to use a LINQ query to process information in the file system: find the files that match a specific search pattern like *.txt, read them, and count the words in each. Searching the file system, reading the contents of the matching files, and counting the words are each relatively expensive operations. You could use a where clause with Boolean operators to perform the filtering in a single where statement, but by splitting the work up and putting lightweight checks early you can reduce the total amount of query processing. (An additional technique that works well here, too, is the let clause. Let assigns a temporary value that is computed once in each iteration and can then be used multiple times.)
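If you do keep the filtering in a single where statement in VB, the choice of Boolean operator matters: And evaluates both operands for every element, while AndAlso skips the right-hand operand when the left-hand one is False. The sketch below illustrates the difference with an in-memory array; CostlyCheck and calls are hypothetical names used only to count evaluations.

```vbnet
Imports System
Imports System.Linq

Module SingleWhereSketch

  ' Hypothetical counter: records how often the costly check runs.
  Private calls As Integer = 0

  ' Hypothetical stand-in for an expensive predicate.
  Private Function CostlyCheck(ByVal value As String) As Boolean
    calls += 1
    Return value.Length > 5
  End Function

  Sub Main()
    Dim words = New String() {"", "filter", "", "query", "predicate"}

    ' And is not short-circuiting: CostlyCheck runs for all five elements.
    Dim withAnd = (From w In words Where w.Length > 0 And CostlyCheck(w) Select w).ToList()
    Console.WriteLine("And: " & calls)

    calls = 0

    ' AndAlso short-circuits: CostlyCheck runs only for the three non-empty elements.
    Dim withAndAlso = (From w In words Where w.Length > 0 AndAlso CostlyCheck(w) Select w).ToList()
    Console.WriteLine("AndAlso: " & calls)
  End Sub

End Module
```

A single where with AndAlso gives the same short circuit for two simple predicates. Separate where clauses become necessary when a let clause has to sit between the cheap check and the expensive one, as in Listing 1.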

In Listing 1, the file system is queried for text files. The first where clause short circuits on empty files. For files with data, all of the text is assigned to the temporary range variable content. The variable content is used to obtain the word count per file, and the second where clause is used to keep only the files that contain more than ten words. Finally, the projection (the output data) contains the filename and the word count of each file that passes both tests.

Listing 1: A LINQ query with multiple where clauses short circuits on empty files—represented by the first where clause.

Imports System.IO
Imports System.Text
Imports Microsoft.VisualBasic.FileIO

Module Module1

  Sub Main()
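    ' Separators are Char literals so the code also compiles with Option Strict On.
    Dim separators As Char() = New Char() {","c, "."c, ";"c, ":"c, "!"c, " "c, "/"c, "?"c}

    ' SearchOption is fully qualified because both System.IO and
    ' Microsoft.VisualBasic.FileIO define a type with that name.
    ' RemoveEmptyEntries keeps runs of separators from being counted as words.
    Dim wordCount = From filename In Directory.GetFiles("C:\temp", "*.txt", System.IO.SearchOption.AllDirectories) _
                    Where FileSystem.GetFileInfo(filename).Length > 0 _
                    Let content = File.ReadAllText(filename) _
                    Let words = content.Split(separators, StringSplitOptions.RemoveEmptyEntries) _
                    Where words.Length > 10 _
                    Select New With {.File = filename, .WordCount = words.Length}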

    For Each item In wordCount
      Console.WriteLine(item)
    Next
    Console.ReadLine()
  End Sub

End Module

Published Dec 24 2009, 02:39 PM by Paul Kimmel (DevExpress)

Comments

 

Michael Proctor [DX-Squad] said:

Firstly, a Merry Christmas to you, Paul. Secondly, I have started reading through the ASP.NET book. I am glad to get to understand the ASP.NET aspects of life, and have already picked up on a few mistakes that I used to make, so thanks.

Just noting that your source sample is VB.NET, I am curious, are you a fellow VBer or just trying to keep us VBers happy?

As for the actual topic ;) is there any performance difference to doing it using LINQ rather than a For Each statement? I am still unfortunately stuck on .NET2 hoping to upgrade our project to 3.5 mid next year. Ultimately I could move this up if I could only find some decent research on .NET 3.5 penetration.

Well it is 4am here on Xmas morning so I better get back to bed before the kids and wife start hassling me :)

December 24, 2009 1:07 PM
 

Paul Kimmel (DevExpress) said:

I have been a VB MVP for 5 years and have programmed in VB off and on to varying degrees since 1978 (ROM BASIC). I have used ROM BASIC, QuickBasic, GW-BASIC, Basic 7.1, VB for DOS, all the way up to today. I have written the VB Today column for codeguru.com for 10 years. I switch up because some of you guys have asked for examples in VB too. Merry Christmas. Thanks for buying the book and thanks for taking the time to write. Have you gotten your v2009 volume 3 upgrade yet?

December 26, 2009 2:43 PM