
Dashboard performs worse on Oracle database than on in-Cognos datasets

Started by moos_93, 05 Dec 2025 04:00:57 AM


moos_93

Good morning!

As the title says: I have made a number of datasets in Cognos, and connected them logically in a data module. On this module I built a dashboard that performs perfectly.

To take load off the local Cognos server, I asked our ETL specialist to recreate the datasets with ETL and write them out to an Oracle database. I then connected these tables in a data module, in exactly the same way as I did with the datasets, and built an identical dashboard on this module. The performance of this dashboard is absolutely horrible, and most visualisations time out while loading.

The fact table in question has about 130,000 records, with several foreign keys for which dimensions were built. Most of these dimensions are 1:N to the fact table, but I also built a date dimension that is N:N, covering the time between START_DATE and END_DATE. When using this dimension, the number of records increases to 170 million.

Am I doing something wrong, or is this performance discrepancy between the Oracle database and the datasets to be expected?

Thanks for your insights!

dougp

That is expected.  The dashboard that gets its data from the database must perform these tasks for every visualization, every time the user touches something:
1. write SQL
2. connect to the database server
3. run a query that involves joins between multiple tables
4. get a response back
5. update the viz

Put another way, when you are connecting to the database:
1. Cognos must write a more complex query.
2. Cognos spends time communicating with the db server.
3. The database server must perform lookups and filters across joins.
4. Cognos spends more time communicating with the db server.
5. Cognos waits for the data to be downloaded across the network.

In contrast, using a Cognos dataset means:
1. write SQL
2. get data from a single table
3. update the viz
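As a rough illustration of the difference (all table and column names below are hypothetical, not taken from the posts in this thread), the two paths might generate queries along these lines:

-- Live connection: Cognos generates a multi-table join that
-- Oracle must resolve on every user interaction.
SELECT d.SALES_YEAR, SUM(f.AMOUNT)
FROM FACT_SALES f
JOIN DIM_DATE d    ON d.DATE_KEY    = f.DATE_KEY
JOIN DIM_PRODUCT p ON p.PRODUCT_KEY = f.PRODUCT_KEY
WHERE p.CATEGORY = 'A'
GROUP BY d.SALES_YEAR;

-- Dataset: the joins were already resolved when the dataset was
-- refreshed, so each visualization reads one flat, local table.
SELECT SALES_YEAR, SUM(AMOUNT)
FROM MY_DATASET
WHERE CATEGORY = 'A'
GROUP BY SALES_YEAR;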

Choosing to use a Cognos dataset involves considering the tradeoffs between size and speed.  For most cases, a dataset will perform much faster than a direct database connection.


This is analogous to Power BI's Import vs. DirectQuery modes.



bus_pass_man

Yes, I think you're doing something wrong.  You don't mention a bridge table for your N:N relationship, which you would probably want for any legitimate bridge-table scenario, but I really don't think this is one of them.

Quote: "covering the time between START_DATE and END_DATE."
Can you clarify what you mean by this?  Is this a duration?  What are you trying to do here?

Also, is the N:N within the time dimension itself, or in the relationship between it and the fact table?  That is unclear.


I cannot comment on other possible modelling problems, because I have not seen your model and all the information I have is what you have chosen to reveal.



One very big problem with datasets is that you can't define data security on them, unlike tables in a loaded schema.

moos_93

Quote from: dougp on 05 Dec 2025 12:22:43 PM: "That is expected.  The dashboard that gets data from the database must perform these tasks for every visualization every time the user touches something. [...]"

Thanks for your reply. So this logic explains why the dataset solution shows a visualisation after 10 seconds, while the Oracle solution times out after a few minutes?

moos_93

Quote from: bus_pass_man on 05 Dec 2025 06:08:47 PM: "Yes I think you're doing something wrong.  You don't mention a bridge table for your N:N relationship [...]"

Thanks for your response. START_DATE and END_DATE indeed describe the duration of a trajectory. Both columns are used for different use cases. In this case I want to report on the number of active trajectories at a given time, using an N:N connection between the two dates and a year-date calendar, such that:

FACT1.START_DATE <= CALENDAR.DATE
FACT1.END_DATE >= CALENDAR.DATE


A trajectory typically spans several months. I considered using a year-month calendar instead, but in the dataset solution the response time was very acceptable even with a year-date calendar. Furthermore, a year-date calendar preserves more of the original data, since other use cases require reporting at the date level. Adding several calendars to describe the same time span would make working with the data too complex for end users.
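To make the row explosion concrete, here is a minimal sketch of the query that range join implies, reusing the FACT1 and CALENDAR names from the conditions above (the aliases and the ACTIVE_TRAJECTORIES column are illustrative, and in real Oracle SQL the DATE column would need quoting or another name, since DATE is a reserved word):

-- Every fact row matches one calendar row for each day it is
-- active, so 130,000 trajectories spanning months multiply into
-- roughly 170 million intermediate rows before aggregation.
SELECT c.DATE, COUNT(*) AS ACTIVE_TRAJECTORIES
FROM CALENDAR c
JOIN FACT1 f
  ON f.START_DATE <= c.DATE
 AND f.END_DATE   >= c.DATE
GROUP BY c.DATE;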


bus_pass_man



I don't know what you mean by 'trajectories' and I think I don't need to know, although knowing would be nice.

You want the count of trajectories where start date >= {some date} and end date <= {some other date}, is that a correct understanding?  That is easily done without mucking about with many-to-many relationships.  Why didn't you try that?
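Presumably something along these lines (a sketch of bus_pass_man's reading; :window_start and :window_end are placeholder bind variables for the two dates):

-- A plain filter on the fact table: count trajectories whose
-- span lies inside the window, with no date-dimension join.
SELECT COUNT(*) AS TRAJECTORY_COUNT
FROM FACT1
WHERE START_DATE >= :window_start
  AND END_DATE   <= :window_end;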


dougp

Quote: "The fact table in question is a table with about 130,000 records, ... When using this dimension, the number of records increases to 170 million."

After rereading this, I think this is the symptom to focus on.

By any chance, does your date dimension have about 1,300 rows, or roughly 3.6 years?  That would explain it: 130,000 fact rows times 1,300 date rows is about 169 million, close to the 170 million you report.  Of course, it would help to work with exact rather than rounded numbers to begin with.

I think what you mean by N:N is not that the date table and the fact table have a many-to-many relationship.  I think you're saying they have no relationship at all: a cross join, or Cartesian join.  So the result is a dataset whose row count is the number of rows in the fact table times the number of rows in the date dimension.

Creating a proper relationship between the tables will help.  The result should have the same number of rows as the fact table.  Once you have that, you can start trying to use the dataset to answer questions.
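For what it's worth, a keyed join of that kind might look like the sketch below (DIM_DATE, DATE_KEY, CAL_YEAR, and CAL_MONTH are assumed names, and joining on START_DATE is only one plausible choice of key):

-- One matching dimension row per fact row keeps the result at
-- the fact table's grain: about 130,000 rows, not 170 million.
SELECT f.*, d.CAL_YEAR, d.CAL_MONTH
FROM FACT1 f
JOIN DIM_DATE d
  ON d.DATE_KEY = f.START_DATE;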