Posted over 3 years ago by nore...@blogger.com (Alex Yakunin)
We implemented this feature a few weeks ago. Imagine we execute the following code:

    var query =
      from customer in Query<Customer>.All
      select new {
        Customer = customer,
        First5Orders = (
          from order in Query<Order>.All
          where order.Customer==customer
          orderby order.Id
          select order
        ).Take(5)
      };
    var queryResult = query.ToList(); // Actual execution
    Console.WriteLine("queryResult.Count: {0}", queryResult.Count);
    foreach (var item in queryResult) {
      var subqueryResult = item.First5Orders.ToList(); // Actual execution must happen here,
                                                       // but see the comments below.
      Console.WriteLine("subqueryResult.Count: {0}", subqueryResult.Count);
    }

As you see, this is a typical case where you would expect to get 1+N queries:

- The first query is the main one.
- All the others are its subqueries. As far as we know, any other ORM will execute each of them on an attempt to enumerate it.

So e.g. if queryResult.Count==90, you would get 91 queries - a particular example of the "Select N+1" issue.
But DO4 will send just 6 batches!
The first one is:

    SELECT [a].[CustomerId], [a].[TypeId], [a].[CompanyName], [a].[ContactName], [a].[ContactTitle],
      [a].[Address.StreetAddress], [a].[Address.City], [a].[Address.Region], [a].[Address.PostalCode],
      [a].[Address.Country], [a].[Phone], [a].[Fax]
    FROM [dbo].[Customers] [a];

All the subsequent ones look like this:

    exec sp_executesql N'SELECT TOP 5 [a].[OrderId], [a].[TypeId], [a].[ProcessingTime], [a].[ShipVia.Id],
      [a].[Employee.Id], [a].[Customer.Id], [a].[OrderDate], [a].[RequiredDate], [a].[ShippedDate],
      [a].[Freight], [a].[ShipName], [a].[ShippingAddress.StreetAddress], [a].[ShippingAddress.City],
      [a].[ShippingAddress.Region], [a].[ShippingAddress.PostalCode], [a].[ShippingAddress.Country]
    FROM [dbo].[Order] [a]
    WHERE ([a].[Customer.Id] = @p1_0)
    ORDER BY [a].[OrderId] ASC;
    -- ...
    -- A set of similar queries is skipped to shorten the output
    -- ...
    SELECT TOP 5 [a].[OrderId], [a].[TypeId], [a].[ProcessingTime], [a].[ShipVia.Id],
      [a].[Employee.Id], [a].[Customer.Id], [a].[OrderDate], [a].[RequiredDate], [a].[ShippedDate],
      [a].[Freight], [a].[ShipName], [a].[ShippingAddress.StreetAddress], [a].[ShippingAddress.City],
      [a].[ShippingAddress.Region], [a].[ShippingAddress.PostalCode], [a].[ShippingAddress.Country]
    FROM [dbo].[Order] [a]
    WHERE ([a].[Customer.Id] = @p16_0)
    ORDER BY [a].[OrderId] ASC;
    ',N'@p1_0 nvarchar(5),@p2_0 nvarchar(5),@p3_0 nvarchar(5),@p4_0 nvarchar(5),@p5_0 nvarchar(5),@p6_0 nvarchar(5),@p7_0 nvarchar(5),@p8_0 nvarchar(5),@p9_0 nvarchar(5),@p10_0 nvarchar(5),@p11_0 nvarchar(5),@p12_0 nvarchar(5),@p13_0 nvarchar(5),@p14_0 nvarchar(5),@p15_0 nvarchar(5),@p16_0 nvarchar(5)',
    @p1_0=N'ALFKI',@p2_0=N'ANATR',@p3_0=N'ANTON',@p4_0=N'AROUT',@p5_0=N'BERGS',@p6_0=N'BLAUS',
    @p7_0=N'BLONP',@p8_0=N'BOLID',@p9_0=N'BONAP',@p10_0=N'BOTTM',@p11_0=N'BSBEV',@p12_0=N'CACTU',
    @p13_0=N'CENTC',@p14_0=N'CHOPS',@p15_0=N'COMMI',@p16_0=N'CONSH'
As you see, we execute such subqueries as future queries, i.e. they're performed in batches. This does not mean we materialize the whole query result at once. Instead, we process it part by part:

- When you pull out the first item, we materialize the first 16 items and cache them. If there are subqueries, they're processed as future queries transparently for you.
- When you pull out the 16th item, we materialize 32 more of them at once in the same fashion.
- And so on; the maximal size of such a bulk is 1024.

Note that we called .ToList() here, so the result was fully enumerated at that moment, and thus all the batches were executed during .ToList() processing. But if we used it in a foreach loop and broke out of it, only a part of the result would be materialized.

So such a materialization process allows us to optimize the interaction with the RDBMS (reduce the chattiness) transparently for you. The process is fully recursive, so e.g. if a subquery contains other subqueries, they'll be resolved in the same fashion. Moreover, if you select an EntitySet in the final selector, it is prefetched in the same way.
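The chunked materialization described above can be sketched in plain C#. This is a minimal illustration, not DataObjects.Net code: the `ChunkedMaterialization` class, the `InGrowingChunks` and `ChunkSizesFor` helpers and the `onChunk` callback are hypothetical names, and only the sizes (16 first, doubling, capped at 1024) come from the post. The sketch fetches the next chunk when the first item past the current chunk is requested, which may differ slightly from the exact moment DO4 does it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkedMaterialization
{
    // Sizes taken from the post: the first chunk holds 16 items,
    // each following chunk doubles, and a chunk never exceeds 1024 items.
    public const int InitialChunkSize = 16;
    public const int MaxChunkSize = 1024;

    // Hypothetical helper: yields items lazily, but pulls them from the source
    // in growing chunks. onChunk is invoked once per chunk with its actual size;
    // in an ORM this would be the point where the subqueries for the whole chunk
    // are sent together instead of one by one.
    public static IEnumerable<T> InGrowingChunks<T>(IEnumerable<T> source, Action<int> onChunk)
    {
        using (var enumerator = source.GetEnumerator())
        {
            int chunkSize = InitialChunkSize;
            while (true)
            {
                // Materialize (cache) up to chunkSize items at once.
                var chunk = new List<T>(chunkSize);
                while (chunk.Count < chunkSize && enumerator.MoveNext())
                    chunk.Add(enumerator.Current);
                if (chunk.Count == 0)
                    yield break;

                onChunk(chunk.Count);
                foreach (var item in chunk)
                    yield return item;

                // A short chunk means the source is exhausted.
                if (chunk.Count < chunkSize)
                    yield break;
                chunkSize = Math.Min(chunkSize * 2, MaxChunkSize);
            }
        }
    }

    // Convenience helper: the chunk sizes observed when `count` items
    // are enumerated completely (the equivalent of calling .ToList()).
    public static List<int> ChunkSizesFor(int count)
    {
        var sizes = new List<int>();
        foreach (var item in InGrowingChunks(Enumerable.Range(0, count), sizes.Add)) { }
        return sizes;
    }
}
```

With 90 source items, as in the example above, full enumeration pulls chunks of 16, 32 and 42 items, while taking only the first item pulls a single chunk of 16. Note that this counts materialization chunks only; how DO4 splits the subqueries of a chunk into SQL batches is not covered by the sketch.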
So this is a good alternative to the prefetch API.

Posted over 3 years ago by Dmitri Maximov

Posted over 3 years ago by Dmitri Maximov

Posted over 3 years ago by nore...@blogger.com (Alex Yakunin)
Finally I decided to surrender and start following the "visual path". Visual tools are what people expect now - frequently it does not matter how good your product / API is: no visual tools = there is nothing to speak about. Or, put differently: if you're so cool, why don't you deliver visual designers, which are commonly expected? And you know, it's impossible to argue with this. It's simply clear we need them.
The next question is what people really expect from us. Currently I see the following weak areas:
Visual model designer
That's what business analysts and beginners expect first of all. We're going to build it over Visual Studio DSL Tools. I always thought this was a pretty hard problem, but after looking at DSL Tools more closely I discovered it must be much simpler than I thought. Moreover, MEScontrol developers successfully use a model designer built by MESware for integrators (it generates v3.9 models) - it is based on DSL Tools. So there is already a kind of prototype we can look at. Let's describe what our designer must do:

- Design persistent models - add, edit and remove persistent classes (Entities, Structures and EntitySets) and their properties.
- Support application of all DataObjects.Net attributes, such as [Association]; it must be possible to define mappings there as well.
- Provide customizable T4 template-based generation of entity code. Obviously, we'll generate partial classes you can extend with your own code.
- Support modelling of IUpgradeHandlers and, likely, even their automatic updates on changes in the model.
- It must be possible to reference externally defined persistent types there, including custom-typed EntitySets. This will allow building separate models for each part of the application.
- It should support reverse engineering - a feature allowing you to (re)generate the model from an existing database.
Query profiler
Possibly, this is an even more important part. We're going to combine the query debugging features of the profiler we have in v3.9 (if you don't know, it is very similar to LINQPad) and the tracing features of NHProf to make the process of debugging DO4-based applications really simple and productive. We expect it to provide the following features:

- Possibility to attach a profiler to any remote DataObjects.Net Domain, if profiling is enabled in its configuration.
- Event tracing. Nearly the same UI as in SQL Server Profiler, although I'd like to see better categorization, filtering and grouping features there.
- Event analysis. Basically, we must be able to attach integrated & custom analyzers to the event streams we produce, and I hope Rx Framework will help us a lot here. Since the results of analysis are event streams as well, their visualization must be similar to event tracing.
- Custom code execution. Yes, I'd like this to be possible. The DataObjects.Net profiling API must allow the profiler to push C# \ VB.NET code to the server and execute it there, capturing all the events.
- Result visualization. Such custom code must be able to return results back to the profiler for visualization. I feel this is one of the most important and complex parts there: I'd like to see which properties of persistent objects are loaded and which are not, explore the relations there, support very large collections of entities, and, moreover, I'd like to be able to edit everything there.

Likely, later it will support other ORM tools. But our initial goal is to perfectly support just DO4.

So the profiler must act not just as a tracing & debugging tool, but nearly as SQL Server Management Studio: in fact, it allows you to do everything except changing the model.

Timeframe: we're going to start work on both parts at the beginning of December; the visual designer is the #1 priority, so I hope we'll be able to show its alpha by the end of this year. Likely, this will delay some of the planned v4.2 features, but not by much: I hope Alex Ilyin (LiveUI author, he will join the DO4 team for a few months) will help us a lot with this.

Any ideas and opinions are welcome.

Posted over 3 years ago by nore...@blogger.com (Alex Yakunin)
Likely, you have noticed there were no nightly builds last week. Now the issue is fixed, and they're back again. Moreover, the v4.1 installer is updated to today's nightly build - I assure you it is very stable (~ 6-7 tests are failing there depending on configuration - that's normal, they are tests for works in progress and a few known issues).

If you have some time, please download and install the latest build: your feedback would significantly help us to deliver a stable v4.1 release at the end of this week. As you know, the installer was one of the most disappointing parts we had so far, and I hope the current one won't suffer from any problems of its predecessors. Please compare your own installation results with the expected ones.

I know what I suggest is quite similar to "help yourselves" ;) But... I feel there is no other good way to do this. We have a limited set of configurations, and although everything works locally, I still get reports & fix some pretty strange issues there. E.g. a few days ago one of the DO4 users helped us to identify a serious bug preventing DO4 from installing on 32-bit Vista. I can't imagine why there was a "HKLM\SOFTWARE\Wow6432Node\Microsoft\VisualStudio\9.0" subkey on 32-bit Windows, but it is the reason for the problem. And I suspect this isn't a very unique case - a similar issue was described at our support forum earlier, but that time we were unable to identify the bug. So testing an installer for a complex framework is complex. But I hope we'll accomplish this - at least this way ;)

Posted over 3 years ago by nore...@blogger.com (Alex Yakunin)
If you ever uninstalled DataObjects.Net, you know we run an "exit poll" there. And so far it was constantly highlighting the two most annoying categories of issues:

- Documentation. Mainly, you indicated there must be a manual you can read sequentially.
- Installer. It was quite buggy. A month ago there were issues even on 32-bit Vista; users of 64-bit Windows had almost zero chances of getting DO4 samples running.

I'm glad to say both of these issues should disappear with the release of v4.1. As you know, we're working on the Manual - it is missing just a few chapters now. The installer was quite significantly improved during October-November; its current version is free of all the identified bugs. And... I hope you'll help us to test it.

So the upcoming v4.1 looks promising. I hope you'll like using it ;)

P.S. Actually we made one more conclusion: we need a visual designer and support for reverse engineering (conversion of a generally arbitrary schema to our persistent types). But that's the topic for the next post.

Posted over 3 years ago by Dmitri Maximov

Posted over 3 years ago by nore...@blogger.com (Alex Yakunin)
A few days ago Microsoft StreamInsight was presented at the Urals .NET User Group. Nikita Samgunov, the author of the presentation, made a really good overview of its features and architecture, so it was really interesting. But I left it confused: the same day Rx Framework became available, so I had a chance to look at it much more closely - and I was really impressed. It was clear StreamInsight solves an almost identical problem, but there are significant differences:

- StreamInsight is positioned as a CEP framework. Rx is positioned as a general purpose event processing framework. This must mean StreamInsight should solve some specific problems much better than Rx. But as it initially looks, this is arguable.
- StreamInsight uses LINQ-to-CepStream. CepStream is IQueryable (we checked this - e.g. it fails on translation of ToString() \ GetHashCode()), so there is a LINQ translator for it. But Rx uses custom LINQ extension methods to the IObservable monad. So StreamInsight builds a program transforming CepStreams while compiling the expression stored in IQueryable, but Rx does the same "on the fly" - the event crunching machine gets built while LINQ combinators are applied to IObservables one after another.
- It seems StreamInsight adds implicit conditions in some cases, such as "event intervals must overlap" for joins. I'm not fully sure if this is really correct, since I didn't study StreamInsight so closely, but at least it looked so. Of course, the same is possible with Rx, but you must do this explicitly.
But there are many similarities as well:
- Both frameworks run all the calculations in memory. There is no persistent state.
- Both frameworks can be hosted inside any application.
- Both frameworks are ready for concurrent event processing. It seems StreamInsight executes everything in parallel by default; the same is possible in Rx, but what's more important, concurrency is fully controllable there (btw, that's a really impressive advantage of Rx: events are asynchronous by their nature, so concurrency looks much more natural there than e.g. in PLINQ).

After summarizing all this stuff for myself, I came to the following conclusion: Rx is definitely worth studying (it seems I wrote that earlier ;) ), but StreamInsight... is, possibly, an evolutionary dead end.

First of all, Rx seems much more powerful and generic. I really can't imagine why I should compile the queries. If there is Rx, the approach provided by StreamInsight looks like writing enumerable.ToQueryable().[Your query].ToEnumerable() instead of just enumerable.[Your query].

Moreover, I clearly understand how complex this problem is - to write a custom LINQ translator. Most developers are able to write their own extension methods to IEnumerable; many are capable of rewriting the whole of LINQ to enumerable. But there are just a handful of teams that wrote their own LINQ translator. So the presence of this layer in StreamInsight looks like unnecessary complexity. If so, it will evolve much slower, will be more complex to extend (e.g. with Rx you can use custom methods without any additional code; but the same won't work with StreamInsight), etc.
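The difference between the two styles can be illustrated with standard LINQ alone. This is only an analogy that assumes nothing about StreamInsight or Rx internals; `QueryStyles`, `Direct`, `ViaExpressionTree` and `OutermostOperator` are hypothetical names. `AsQueryable()` makes the operators build an expression tree that a provider must later translate or compile, while the plain `IEnumerable<T>` operators are ordinary extension methods that compose the pipeline directly, similar to what the post describes for Rx combinators on IObservable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class QueryStyles
{
    // Direct style: Where/Select are ordinary extension methods on IEnumerable<int>.
    // Each call wraps the previous sequence, so the pipeline is built "on the fly"
    // and any custom extension method can take part without extra work.
    public static IEnumerable<int> Direct(IEnumerable<int> source)
    {
        return source.Where(x => x % 2 == 0).Select(x => x * 10);
    }

    // Expression style: the same operators on IQueryable<int> only record
    // the calls as an expression tree. A query provider has to translate
    // (or, for in-memory data, compile) that tree before anything runs.
    public static IQueryable<int> ViaExpressionTree(IEnumerable<int> source)
    {
        return source.AsQueryable().Where(x => x % 2 == 0).Select(x => x * 10);
    }

    // Shows that the query is data: the root of the tree is a method call node
    // describing the last operator that was applied.
    public static string OutermostOperator(IQueryable<int> query)
    {
        var call = (MethodCallExpression)query.Expression;
        return call.Method.Name;
    }
}
```

Both methods produce the same items for in-memory data; the difference is that the second one goes through an expression tree first, which is the extra layer a custom translator has to understand.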
On the other hand, there is a statement from leaders of both teams stating that both frameworks are useful.
I think Erik Meijer should simply say: "accidentally we kicked StreamInsight team's ass" - but certainly, this doesn't seem really possible ;) Phrases like "It (Rx) is particularly useful to reify discrete GUI events and asynchronous computations as first class values." look especially funny if you've seen any videos with Erik Meijer on Rx (this one is a very good intro): the Rx application area is much wider than just GUI event processing. In fact, Rx offers a new language (a DSL inside C#, F# and so on) allowing you to describe asynchronous computations much more naturally (or distributed ones, such as Paxos). Taking into account that we can now increase just the CPU count (or machine count), but not their frequency, the appearance of Rx seems very important. If you watched the video, you should remember Erik Meijer said that, likely, he can retire right now - he's almost fully sure this is the nicest abstraction he invented.
So... Study Rx, not StreamInsight ;)
P.S. It's amazing how a few people (or maybe even just a single person) can change the way we think, and how fast a new technology they built can kill the technology it originated from.

Posted over 3 years ago by nore...@blogger.com (Alex Yakunin)
That's the feature I always wanted to see! This means now we can use the same .csproj files to target both .NET 4 and Silverlight. Earlier it was much more complex.
Here is the full list of new features in Silverlight 4 beta.

Posted over 3 years ago by nore...@blogger.com (Alex Yakunin)
We're going to release v4.1 next week:

- Features that currently exist there will be polished until that moment.
- We'll finish the Manual.
- DisconnectedState will be fully operational; the WPF sample will rely on it.
- And finally, Azure SQL will be fully supported.

One more piece of good news is that a part of our team has already switched to the next set of features scheduled for v4.2:

- Nested transactions. They'll allow you to roll back a part of the changes you made. As before, all the modifications made to entities will be rolled back transparently.
- Global cache. Mainly, we're going to provide an API allowing you to plug in any implementation there. Initially we'll support an integrated LRU cache (nearly as in v3.9) + Velocity. A flexible expiration policy, cacheable queries and version checks only on writes are among the features we're going to implement here. Btw, the most complex part we need here is already done: DisconnectedState utilizes exactly the same API (SessionHandler replacement) to make Session expose the data it caches.
- Localization. Initially it will be impossible to use different collations for different localizations, but the API itself will be much better (and much more explicit) than in v3.9.
- Access control system. As you may find, we are going to provide an API that will be much more "open". Moreover, the ACL structure there will be fully relational, so you'll be able to utilize it in queries (e.g. to return only the objects you can access).
- O2O mapping. The idea is fully described, so please see the link. We have finally decided this is the best option. Another idea is to support WCF serialization directly by our entities. This approach has a set of disadvantages: you'll be able to marshal entities only "as is", so e.g. you won't be able to expose the same entity differently for different services or versions of an API. The advantage is that no additional coding is necessary if you want to marshal entities. So in general, O2O looks better here: it provides much better flexibility. The same set of server-side entities and various DTOs for different WCF APIs (or their parts - e.g. for CustomerForm, SalesReportForm and so on) seems an almost ideal solution.

So we're going forward, and v4.2 will be one more very important milestone for us. In fact, it will bring almost all the features we had in v3.9 (the only ones left are full-text search and partitioning), and will allow using it with both WCF and ADO.NET Data Services. Likely, .NET RIA Services will also be supported after getting O2O mapping done, so we'll provide a complete spectrum of supported communication APIs.