Standardization of data

ctodx
12 November 2008

Over the past few months I've read a few blog posts decrying the good ol' Northwind database (stemming, I think, from Scott Hanselman's original post) and saying "we" (that is, the entire .NET development world) need something new and different. None of that Products and Suppliers stuff, it's so passé; we need something new.

Actually, speaking from behind my desk with my DevExpress cap firmly on my head (photos to follow -- you mean you didn't get one at PDC?), I say phooey. We vendors not only need Northwind, but we need a richer Northwind.

The pros of Northwind are simple to enunciate:

  • The domain is easily comprehensible. It's orders with items, it's products being sold, it's customers buying them. All developers can relate to this: you don't even have to think about the domain.
  • Since all vendors use it to a greater or lesser degree for their demos, it's automatically familiar, which, apart from reinforcing the first point, also means that the developer being demoed to can just concentrate on the vendor's spiel. After all, we don't support and go to tradeshows like TechEd and PDC in order to sell Northwind.
  • Since it's so widespread, it becomes part of the benchmark for evaluating similar products from different vendors. If you are looking for a grid that does master/detail views and are evaluating A, B, C, and DX's versions, then you already know you've got to plug in the Orders table and the Order Details table.
  • Since it's readily available, it makes life easier: vendors don't have to invent, copy, or plagiarize any of this data from other places or each other in order to show off their wares.

In essence, it's a standard.

So what does it need? Again with my vendor cap firmly in position:

  • Multimedia data. One reason Northwind is so old-fashioned is that there are no images, audio tracks, or videos in the database.
  • Textual or memo data. Something more than a one-sentence description, in other words. This could be used for showing off rich text editors, spell checkers, mail merge, etc.
  • Lots of data. And I mean lots, as in many, many thousands of records. It's only through having some standard data like this that you can evaluate performance.
  • PIM-style data. You know, appointments, contacts, that kind of thing. Can't sell a scheduler control without it.
  • Data that can be charted. This is huge since there are so many different chart types. So you should have data that can be charted with bars, pies, lines; project data for Gantt charts; historical stock data for the financial charts; and so on.
  • Real-time data, that is, data that is being updated in real time. In the past, we've hooked into the performance counters on the demo machine to show off our performance with real-time data.

As you can see, in order for us to show off our controls, we have to spend resources and time to invent or generate an awful lot of data (and it's not like we can sell this data to recoup the cost). Northwind is just not broad or rich enough.
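To give a feel for the "lots of data" item above, here is a minimal Python sketch of a seeded generator that produces as many order and order-line records as you ask for. The column names loosely echo Northwind's Orders and Order Details tables, but the function name `generate_orders`, the seed, and every value range are assumptions made up for illustration; this is not how we (or anyone else) actually build demo data.

```python
import random
from datetime import date, timedelta


def generate_orders(n_orders, seed=2008):
    """Return (orders, details) as lists of dicts.

    Hypothetical helper: all ranges below are illustrative assumptions.
    A fixed seed means every run yields the same rows, so two products
    can be timed against identical data.
    """
    rng = random.Random(seed)  # private RNG, unaffected by global state
    start = date(2008, 1, 1)
    orders, details = [], []
    for order_id in range(1, n_orders + 1):
        orders.append({
            "OrderID": order_id,
            "CustomerID": rng.randint(1, 500),       # assumed 500 customers
            "OrderDate": start + timedelta(days=rng.randint(0, 364)),
        })
        # Each order gets between one and five detail lines.
        for line_no in range(1, rng.randint(1, 5) + 1):
            details.append({
                "OrderID": order_id,
                "LineNo": line_no,
                "ProductID": rng.randint(1, 77),     # assumed 77 products
                "Quantity": rng.randint(1, 50),
                "UnitPrice": round(rng.uniform(2.0, 250.0), 2),
            })
    return orders, details


if __name__ == "__main__":
    orders, details = generate_orders(100_000)
    print(len(orders), "orders,", len(details), "detail rows")
```

Because the generator is seeded, the data is reproducible, which is the property a benchmark needs; it says nothing about whether the data is realistic.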

We -- and you -- need standard demo data, for without it, you'll be comparing apples and oranges. Having said that, using standard data means that you run the risk of vendor code being written to work at its best with that data, and if you veer off the straight and narrow the vendor's code might start to work less well. Conversely, if the vendor invented or generated the data, it's very likely to have been invented just to show off the product being demoed. Overall, though, I think standard data wins out.

7 comment(s)
Adam

I feel like DevExpress has set the standard in so many respects.  DxCore comes to mind.  Maybe you could offer your prize dataset to all comers and this may even bring in widespread attention from non-developers such as DBAs.

12 November, 2008
Paul Fuller

Good comments Julian.  I agree with your main points that having a 'standard' set of demo data is good for us and that it could do with some improvements.

BTW - Are you aware of the AdventureWorks demo data?  It fulfills some of your wish list items.

The other thing that would constitute a major improvement is to have, say, monthly update packs.  Showing data from 1998 in today's terms is pretty annoying.

Wouldn't it be great to have a living, breathing (although synthetic) company represented in the data?  Staff join and leave, get promoted.  New products are added and old ones drop off.  Customer orders grow, shrink and change in nature.  Dirty data creeps in (maybe that is going too far!).

I'd subscribe to Northwind 2009Q1, Northwind 2009Q2 etc if it wasn't too expensive.

Cheers

12 November, 2008
Heather

Hi Julian,

Thanks for your thoughts on the subject.  I think a few of us may feel like Scott Hanselman about Northwind.  It is just too outdated on several levels.

You guys are also missing out on the opportunity to demonstrate the serious product differentiation you might have with more modern implementations of SQL (e.g. SQL Server 2008).  The way you designed a physical database in 1997, with Access as the common denominator, is very different from the way you SHOULD design a physical database with, say, SQL Server 2008.

Here is the link to Scott's blog post - you may have already read it but he makes some very valid points.

www.hanselman.com/.../CommunityCallToActionNOTNorthwind.aspx

Part of me definitely agrees with the need for standards, but standards need to evolve or we are stuck "worshiping" something that time has forgotten or, if you will, the proverbial data model to nowhere... :)

13 November, 2008
Linton

Good thoughts, Julian.

As Paul mentioned, AdventureWorks provides a rich start:

"The AdventureWorks database supports standard online transaction processing scenarios for a fictitious bicycle manufacturer (Adventure Works Cycles). Scenarios include Manufacturing, Sales, Purchasing, Product Management, Contact Management, and Human Resources."

It can be downloaded here:

www.codeplex.com/MSFTDBProdSamples

13 November, 2008
Martin Brekhof

I think you would be better off creating an industry-standard domain model (open-sourcing community.devexpress.com/.../writing-domain-components-framework-dc.aspx ???); then people like EMS (sqlmanager.net/.../datagenerator) would be likely to deliver data generators for each and every conceivable need/purpose.

For example: test data that covers US ZIP codes is rather useless for me, as I'm Dutch and so are my customers. It would probably be too cumbersome for DevExpress to adapt the data for each and every country their customers are in. However, it would be just business as usual for those who earn a living making data generators.

13 November, 2008
Julian Bucknall (DevExpress)

AdventureWorks: I'm sure we've looked at that, but I'll ping and make doubly sure.

Cheers, Julian

13 November, 2008
Guido Volkmann

Oh yeah... I'm sure that's the birth of the "Westwind Corporate" ;-)

14 November, 2008
