Author Topic: A way to deal with large datasets and performance problems

Offline Joscha

I worked on an active report for a long time and had major performance problems. The active report was based on a very large dataset with many dimensions (time, three spatial dimensions with a total of 400 areas, different indicators, different measures, ...).

The report consisted of two customized visualizations (a map and a bar chart) and two lists. It was supposed to work like this:
1. Select an indicator using some nested dropdown lists.
2. The map shows the measures of the indicator for all areas of one selected spatial dimension.
3. Click on one area on the map.
4. The list shows all available indicators for the selected area...

Everything worked as intended, but the performance was awful. The report was about 40 MB in size, took 35 hours (!) to create and two minutes to load, and setting a filter caused a lag of up to 15 seconds.
I read a lot about performance issues online and concluded that my lists were the cause. I had combined nesting the lists in data decks with filtering the lists via variables...
The visualizations, on the other hand, worked fine, with no performance problems at all. That got me thinking: why not build a new RAVE visualization that works like a list?
Long story short: I created a RAVE visualization that shows exactly the same data as one row of the "standard" list in my report. All the filtering and nesting in different data decks can be done directly by applying filters to the visualization, and it works like a charm!

Comparison of the old report (using lists) and the new report (using custom visualizations):
File size: "old" report: 42 MB / "new" report: 15 MB
Time to create: "old": 35 hours / "new": 3 hours
Time to load: "old": 2 minutes / "new": 2 seconds
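
To put those numbers in proportion, here is a small Python sketch that works out the improvement factor for each metric. The figures are the ones from the comparison above; the dictionary keys and the function name are only illustrative.

```python
# Figures copied from the comparison above; the key names are illustrative.
old = {"size_mb": 42, "build_hours": 35, "load_seconds": 120}  # 2 minutes = 120 s
new = {"size_mb": 15, "build_hours": 3, "load_seconds": 2}


def improvement_factors(old, new):
    """Return the old/new ratio for each metric (higher = bigger improvement)."""
    return {key: old[key] / new[key] for key in old}


factors = improvement_factors(old, new)
for key, factor in factors.items():
    print(f"{key}: {factor:.1f}x")
```

That works out to a file roughly 2.8 times smaller, a build roughly 12 times faster, and a load time 60 times shorter.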

This is one very simple visualization I created. I made some others with higher complexity, but this one is the ideal starting point for anyone who wants to go the same way I did.

Code:
{
   "copyright":"Joscha",
   "data":
   [
      {
         "id":"originalwerte",
         "fields":
         [
            {
               "id":"originalwert",
               "label":""
            },
           
            {
               "id":"einheit",
               "label":"",
               "categories":
               [
                  " Einwohner"
               ]
            }
         ],
         "rows":
         [
            [
               11,
               null
               
            ]
         ]
      }
   ],
   "grammar":
   [
      {
         "coordinates":
         {
            "dimensions":
            [
               {
                  "scale":
                  {
                     "spans":
                     [
                        {
                           "fit":"exact"
                        }
                     ],
                     "local":true
                  }
               },
               {
                  "scale":
                  {
                     "spans":
                     [
                        {
                           "fit":"exact"
                        }
                     ],
                     "local":true
                  }
               }
            ]
         },
         "elements":
         [
            {
               "type":"point",
               "position":
               [
                  {
                     "value":"50%"
                  },
                  {
                     "value":"50%"
                  }
               ],
               "style":
               {
                  "symbol":"rectangle",
                  "width":"30px",
                  "height":"26px",
                  "fill":"transparent"
               },
               "label":
               [
                  {
                     "content":
                     [
                        {
                           "$ref":"originalwert"
                        }
                     ],
                     "style":
                     {
                        "align":"middle",
                        "location":"fit",
                        "font":
                        {
                           "weight":"normal",
                           "size":"11pt",
                           "family":"Helvetica"
                        },
                        "fill":"#000000"
                     }
                  }
               ]
            }
         ],
         "style":
         {
            "fill":"transparent"
         }
      }
   ],
   "version":"7.2",
   "size":
   {
      "width":"22",
      "height":"22"
   },
   "style":
   {
      "padding":"0%",
      "fill":"transparent"
   }
}
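
When adapting a spec like this to other fields, one easy mistake is renaming a field in "data" but not in the label's "$ref". Below is a minimal Python sketch of a consistency check for that. It assumes that every "$ref" is meant to name a field "id" declared under "data", which holds for the spec above but may not cover everything RAVE allows. The function names are hypothetical, and the embedded JSON is a trimmed copy of the spec, not the full thing.

```python
import json

# Trimmed copy of the spec above: only the parts this check looks at.
SPEC_JSON = """
{
  "data": [
    {
      "id": "originalwerte",
      "fields": [
        {"id": "originalwert", "label": ""},
        {"id": "einheit", "label": "", "categories": [" Einwohner"]}
      ],
      "rows": [[11, null]]
    }
  ],
  "grammar": [
    {
      "elements": [
        {
          "type": "point",
          "label": [{"content": [{"$ref": "originalwert"}]}]
        }
      ]
    }
  ]
}
"""


def declared_field_ids(spec):
    """Collect the ids of all fields declared in the "data" section."""
    return {
        field["id"]
        for dataset in spec.get("data", [])
        for field in dataset.get("fields", [])
    }


def find_refs(node):
    """Recursively collect every "$ref" string anywhere in the spec."""
    refs = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                refs.append(value)
            else:
                refs.extend(find_refs(value))
    elif isinstance(node, list):
        for item in node:
            refs.extend(find_refs(item))
    return refs


def unresolved_refs(spec):
    """Return the "$ref" targets that match no declared field id (assumption)."""
    return sorted(set(find_refs(spec)) - declared_field_ids(spec))


spec = json.loads(SPEC_JSON)
print(unresolved_refs(spec))
```

An empty list means every reference points at a declared field. Anything listed is a name that was probably changed in one place only.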


Maybe this post will help anyone looking for a way to deal with performance problems...

Best
Joscha
« Last Edit: 30 Aug 2017 12:03:09 am by Joscha »