
How can I handle an OutOfMemoryException when reading 50 million records with NHibernate? I used a stateless session, which got me to around ~1.7 million records, but that is not enough for my report. On top of that, I need to run calculations over the records, so I can't split the result set into parts. Could you let me know how to optimize this?

    public IEnumerable<ATMDispenseReport> GetReport(string WhereClause)
    {
        var CurrentUnitOfWork = this.UnitOfWork as MainUnitOfWork;

        // A stateless session avoids the first-level cache, but List<T>()
        // below still materializes every row in memory at once.
        var session = CurrentUnitOfWork.StatelessSession;

        string SqlQuery = "MyQuery";

        var lstATMDispenseReport = session.CreateSQLQuery(SqlQuery)
            .SetResultTransformer(Transformers.AliasToBean<ATMDispenseReport>())
            .List<ATMDispenseReport>();

        return lstATMDispenseReport;
    }

Edit: The final solution was to let the DBMS handle the query calculation.

Ali Kianoor
  • Could you give an example of what your data looks like and what you want to calculate? Maybe there is a way to do the calculations on the database side, initiated by appropriate aggregations in your SQL query. – Patrick Sep 19 '16 at 11:42
  • The "O" in ORM stands for Object. Objects have behaviour. Reports are not objects because they don't have any behaviour. Even if you insist on using an ORM for reporting, there's no way you can create a report with 1.7M rows - no human (or machine) would be able to use it. What are you trying to do? Loading 1.7M rows into RAM won't make the report go faster than a SQL statement that returns only the final report data. – Panagiotis Kanavos Sep 19 '16 at 12:53
  • 50M records are a very, VERY strong indicator that you need a proper reporting schema and a proper ETL process that fills it. – Panagiotis Kanavos Sep 19 '16 at 12:54
  • Thanks Patrick & Panagiotis Kanavos :) – Ali Kianoor Sep 21 '16 at 13:33
  • @PanagiotisKanavos Do you think it would be a good idea to have two DataTables to load 16 million and 8 million rows? – I Love Stackoverflow Feb 15 '18 at 10:18
  • @User No, I think you need a proper database schema, indexes, and well-written queries so you *don't* need to load 24M rows to the client. 24M isn't a lot of data for a database. Star schemas are built to deal with hundreds of GB of data. Columnstore indexes, available in all editions (even LocalDB) since SQL Server 2016 SP1, make it extremely fast to run grouping queries directly against the tables without Analysis Services. – Panagiotis Kanavos Feb 15 '18 at 10:21
  • @PanagiotisKanavos Actually I already have an index on the database. You can check out this question which I have posted here: https://stackoverflow.com/questions/48785641/what-is-the-best-way-to-load-huge-result-set-in-memory – I Love Stackoverflow Feb 15 '18 at 10:24
  • @User I have 10 tables of 16M each with joins between them and don't load them to the client. If you think you need to load 24M rows to the client for processing, your schema and queries need work. Loading all of that to the client is the **problem** not the solution – Panagiotis Kanavos Feb 15 '18 at 10:25
  • @PanagiotisKanavos Sir, my use case is different: I am trying to compare 12 million records from a SQL Server database table with 12 million records from an Oracle database table. In order to do this I have to compare them in memory, right? – I Love Stackoverflow Feb 15 '18 at 10:27
  • @User don't hijack other questions then. And *do* use SSIS – Panagiotis Kanavos Feb 15 '18 at 10:29
  • @PanagiotisKanavos I believe the best solution is using the DBMS to handle this. – Ali Kianoor Sep 30 '19 at 09:02
  • Actually I found that using the DBMS could handle my issue at that time. @PanagiotisKanavos Thank you anyway, and thanks to Groo's 3rd solution. – Ali Kianoor Mar 18 '20 at 05:33

1 Answer


I am presuming you are running on x86 (32-bit). An array (or a List<T>, which is backed by an array) in .NET has to be stored in a contiguous block of memory, so a list of 50 million items needs a single block of more than 200MB just to store the references to each ATMDispenseReport instance (and the backing array gets reallocated several times along the way, because lists grow dynamically).

Although a 32-bit app has a 2GB memory limit, if you have already allocated a bunch of stuff and the memory is fragmented, it's possible there is no contiguous area large enough. And then you also have the 50 million ATMDispenseReport objects themselves, which don't need to be allocated contiguously, but each of them is probably at least 10 times larger than a 4-byte reference.
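
Back-of-envelope, using the figures from this answer:

    50,000,000 references × 4 B   =  200 MB   (contiguous backing array, 32-bit)
    50,000,000 objects    × ~40 B ≈  2 GB     (the ATMDispenseReport instances)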

  1. Dirty hack? Switch to x64 and check if you have enough RAM in the machine. You might need <gcAllowVeryLargeObjects> too (see the config sketch after this list). Note that, for objects with an average size of ~40B, you will need more than 2GB of RAM. That seems a bit much for a single query, since I doubt your user really needs all these rows at once.

  2. Simple solution? Replace List<T>() with Enumerable<T>() and enumerate the results, if you really do need to visit every row (see the sketch after this list). This streams the data lazily, so you only need to keep a single element in memory at any time. Remember, however, that all this data still needs to be delivered through the socket your app uses to connect to the DBMS server. We are probably talking about several GBs of data.

  3. Better solution? Rewrite your SQL to aggregate all the data on the SQL side, before it gets back to your app, and let the DBMS server do all the work (a hypothetical example follows below).
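
For solution 1, a minimal app.config sketch. This is the standard .NET runtime element, but whether you actually need it depends on your allocation sizes: gcAllowVeryLargeObjects only matters once a single array exceeds 2GB, and it requires .NET 4.5+ running as a 64-bit process:

    <configuration>
      <runtime>
        <!-- Lifts the 2GB limit for individual arrays in 64-bit processes -->
        <gcAllowVeryLargeObjects enabled="true" />
      </runtime>
    </configuration>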
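
For solution 2, a sketch of the streaming variant, reusing the MainUnitOfWork and ATMDispenseReport types from the question:

    public IEnumerable<ATMDispenseReport> GetReportStreamed(string WhereClause)
    {
        var currentUnitOfWork = this.UnitOfWork as MainUnitOfWork;
        var session = currentUnitOfWork.StatelessSession;

        // Enumerable<T>() yields rows one at a time from an open data
        // reader instead of materializing the whole list up front.
        return session.CreateSQLQuery("MyQuery")
            .SetResultTransformer(Transformers.AliasToBean<ATMDispenseReport>())
            .Enumerable<ATMDispenseReport>();
    }

Keep in mind that the session (and its connection) must stay open until the caller has finished enumerating the results.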
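
For solution 3, a hypothetical example. The actual query is elided as "MyQuery" above, so the DispenseLog table, its columns, and the ATMDispenseSummary DTO here are made up purely for illustration:

    public IList<ATMDispenseSummary> GetReportSummary()
    {
        var currentUnitOfWork = this.UnitOfWork as MainUnitOfWork;
        var session = currentUnitOfWork.StatelessSession;

        // The DBMS collapses the 50M detail rows into one summary row
        // per ATM, so only the aggregated result crosses the wire.
        string sqlQuery = @"SELECT AtmId,
                                   SUM(Amount) AS TotalDispensed,
                                   COUNT(*)    AS DispenseCount
                            FROM   DispenseLog
                            GROUP  BY AtmId";

        return session.CreateSQLQuery(sqlQuery)
            .SetResultTransformer(Transformers.AliasToBean<ATMDispenseSummary>())
            .List<ATMDispenseSummary>();
    }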

vgru
  • Thank you very much, Groo. I tried solution 1 before this post and it didn't work either, but I think solution 2 fixed my problem. – Ali Kianoor Sep 21 '16 at 13:32