This page summarizes the outages and data load issues that have occurred since the March go-live date and the steps that have been taken to prevent them from happening again:
ID | Issue | Mitigation task | Outage dates | Deployment date / status |
---|---|---|---|---|
564 | A runaway process, possibly initiated by a faulty report, a replication process, or another cause, slowed the TM1 system severely and prevented the nightly load process from completing. | 1. Installed Fix Pack 3 for TM1 9.5.2; the release notes reference several issues involving server crashes. 2. Created a method to kill any process that runs longer than 1 hour. | May 31, 2013; June 20, 2013; July 24, 2013 | All items completed as of 8/28/2013 |
581 | A report may be causing a long-running process on TM1, which slows the TM1 system severely and prevents the nightly load process from completing. | Set up Cognos BI audit logging so the report can be identified. | 9/17/2013 | |
574, 580 | Some questions about system stability relate to data entry errors that we could not trace because logging was turning off unpredictably in TM1. | Investigated the root cause of auto-logging turning off on some cubes. Fixed TM1 logging. At this time we have no evidence that any data was lost. Please report any issues or concerns as soon as you identify them and we will investigate, starting with the system logs. | 8/28/2013 | |
577 | Data loads failed when CMM contained a date earlier than the minimum date SQL Server accepts. | Addressed in the load process. | April 25, 2013 | 9/9/2013 |
578 | Users reported that their data disappeared from one day to the next or after an outage. | Verified that data is saved when TM1 crashes and that all logged changes are recovered. | 8/28/2013 | |
579 | Users reported slowness between 3 pm and 5 pm. | 1. Added 96 GB of memory to the system on 8/13/13, bringing the total to 192 GB. 2. Monitored the TM1 server for 2 weeks between 3 pm and 5 pm and saw no sign of load issues on the server. Users should report this behaviour when it occurs. Documentation of what they were doing at the time would help. | 9/9/2013 | |
n/a | Source system outages may prevent CPM load processes from completing. (On June 17 the iVantage system was not responding.) | Please note that there is still a risk of data load failures due to issues with source systems or source system data. If the nightly load fails, IT will work with entity Budget Offices to determine whether to take the system offline during work hours to re-run the load process or to wait until the next day. This decision and its impacts will be communicated to cpmparticipants in the event of a load issue. | June 17, 2013 | 9/8/13 |
n/a | The password on the SQL Server run credential was not changed, which prevented a load job from starting. | Switched to a non-expiring password. | May 13, 2013; May 30, 2013 | May 30, 2013 |
575 | Two employee IDs in iVantage were identical, causing a load issue. | 1. Change the load to rely on the PEID rather than the employee name. This is a more far-reaching change than anticipated, and extensive testing is needed before it can be deployed. Preliminary testing is complete; the change is ready for test and awaiting testing resources. 2. Explore whether the load process can proceed despite the issue (e.g. by ignoring the 2nd PEID), reporting only minor errors. | March 26, 2013 | Item 1 - Waiting for test |
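
The second mitigation option for issue 575 (let the load proceed by ignoring the 2nd PEID and reporting a minor error) could look roughly like the sketch below. It is illustrative only: the actual load process is not written in Python, and the function name `split_duplicate_peids`, the record layout, and the field names `peid` and `name` are all assumptions made for this example.

```python
from typing import Iterable


def split_duplicate_peids(rows: Iterable[dict]) -> tuple[list[dict], list[str]]:
    """Keep the first record seen for each PEID; report later repeats as warnings.

    Each row is assumed (hypothetically) to be a dict with "peid" and "name" keys.
    """
    seen: set[str] = set()
    kept: list[dict] = []
    warnings: list[str] = []
    for row in rows:
        peid = row["peid"]
        if peid in seen:
            # Do not abort the load; record a minor error and move on.
            warnings.append(f"Duplicate PEID {peid}: skipped record for {row['name']}")
            continue
        seen.add(peid)
        kept.append(row)
    return kept, warnings


# Made-up sample data: the third record repeats PEID 1001.
sample = [
    {"peid": "1001", "name": "A. Example"},
    {"peid": "1002", "name": "B. Example"},
    {"peid": "1001", "name": "C. Example"},
]
kept, warnings = split_duplicate_peids(sample)
for message in warnings:
    print(message)
```

With this approach the load continues with the first record for each PEID, and the skipped records surface in a warning list that can be reviewed after the load instead of stopping it.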