The Australian Public Service Better Practice Guide to Big Data © Commonwealth of Australia 2014 (BPG) gives some good advice on big data management. In last month’s QESP newsletter, I asked whether the government is following its own advice. This month, we look at big data management practices, focussing primarily on the advice given in the BPG.
So, how do we approach big data management? I suggest that a good starting point is to identify characteristics that differentiate big data from conventional data, as well as characteristics of big data projects. What are these characteristics, and what approaches do they encourage? By adopting these approaches, can we avoid management pitfalls?
One of the characteristics of big data projects is that they combine software development, business process change and scientific research. As scientific research is inherently uncertain, project managers need to manage expectations about the level of certainty the project can deliver. They also need to ensure that the project has a clear business objective. As Tom Davenport states in his Harvard Business Review article ‘What Makes Big Data Projects Succeed’:
Popular wisdom suggests that big data projects are primarily about sifting through a big pile of data to find promising relationships. That is an essential task, but it will be an unproductive fishing expedition unless a company has a business problem in mind.
For the full article, see http://blogs.hbr.org/2014/03/what-makes-big-data-projects-succeed/.
For further reading on the management of scientific research projects, see the ANAO’s ‘Management of Scientific Research and Development Projects in Commonwealth Agencies Better Practice Guide for Senior Management’: http://www.anao.gov.au/~/media/Uploads/Documents/management_of_scientific_research_and_development_projects.pdf
Another repercussion of a culture of research and exploration is the need to be able to change direction quickly based on experimental results. This characteristic lends itself to the use of agile project management methodologies. To mitigate the uncertainty of research, we need to be ready to end the project if it does not demonstrate potential by the end of the feasibility stage, as advised in the BPG.
Another characteristic of big data is its use of a wider variety of relevant data. This implies that there is often a wider range of stakeholders, some of whom are not immediately apparent. These people may be stakeholders because their data is being used, or because they are impacted by the project outcomes. Significant stakeholders should be identified and engaged early in the project timeline. The BPG recommends that ‘stakeholder engagement plans should include transparent review processes of the data ownership, data acquisition, data management, results, and the application of findings where appropriate’. In addition, the wider range of stakeholders can come from a larger number of business areas. Senior management sponsorship can help cross organisational boundaries and facilitate responsiveness.
To make full use of big data, high-value data must be identified and acquired. The BPG recommends agencies ‘develop systematic intelligence on data available, identify high-value data sources and produce and maintain information asset registers’. This advice applies to businesses as well as government agencies.
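As an illustration only, an information asset register can start as a simple list of data sources with an owner and a value rating. The sketch below is not taken from the BPG: the field names, the 1 to 5 value scale and the example entries are all assumptions made for the purpose of the example.

```python
from dataclasses import dataclass, field


@dataclass
class DataAsset:
    """One entry in a hypothetical information asset register."""
    name: str
    owner: str                    # business area responsible for the data
    contains_personal_info: bool  # flags the entry for privacy review
    business_value: int           # assumed rating: 1 (low) to 5 (high)


@dataclass
class AssetRegister:
    assets: list = field(default_factory=list)

    def add(self, asset):
        self.assets.append(asset)

    def high_value(self, threshold=4):
        """Return assets rated at or above the threshold, highest first."""
        selected = [a for a in self.assets if a.business_value >= threshold]
        return sorted(selected, key=lambda a: a.business_value, reverse=True)


# Made-up entries for illustration.
register = AssetRegister()
register.add(DataAsset("Customer transactions", "Finance", True, 5))
register.add(DataAsset("Web server logs", "ICT Operations", False, 3))
register.add(DataAsset("Call centre notes", "Customer Service", True, 4))

high_value_names = [a.name for a in register.high_value()]
```

Recording the owner alongside each source also gives a starting list of stakeholders to engage, and the personal information flag identifies the sources that need privacy review.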
Many big data discoveries realise their full potential when applied to improve business processes. As these improvements may impact how workers do their jobs, it is important to factor change management into the project scope. If workers do not see the value of the recommendations, they may ignore them. Benefits are more likely to be realised if affected stakeholders are included in the change management process from the outset.
A critical concern for big data is privacy. Where personal information is included, the Australian Privacy Act must be considered. The BPG lists the following considerations:
- The collection of personal information from sources other than the individual
- The creation of new data through data analytics that generates enhanced information about a person
- The capacity to compromise anonymity through the collation of a range of data that reveals identity
- The potential for unstructured information sources to hold personal information not known by the individual.
The BPG recommends robust de-identification capabilities, privacy by design and privacy impact assessments. The ADMA’s ‘Best Practice Guideline: Big Data – 2013’ discusses the Australian Privacy Principles as they relate to big data; see http://www.adma.com.au/assets/Uploads/Downloads/Big-Data-Best-Practice-Guidelines2.pdf.
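The risk that collated data reveals identity can be made concrete with a small check. The sketch below counts how many records share each combination of indirectly identifying fields; a combination held by only one record points to a single person even when names have been removed. This is the idea commonly called k-anonymity, which is an illustration chosen here and not a method prescribed by the BPG. The field names and records are made up.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the same
    combination of values for the given fields (the 'k' in k-anonymity)."""
    if not records:
        return 0
    groups = Counter(
        tuple(record[name] for name in quasi_identifiers)
        for record in records
    )
    return min(groups.values())


# Hypothetical records with names already removed.
records = [
    {"postcode": "2000", "birth_year": 1980, "sex": "F"},
    {"postcode": "2000", "birth_year": 1980, "sex": "F"},
    {"postcode": "2000", "birth_year": 1975, "sex": "M"},
    {"postcode": "2600", "birth_year": 1980, "sex": "F"},
    {"postcode": "2600", "birth_year": 1980, "sex": "F"},
]

# Postcode alone: every value is shared by at least two records.
k_postcode = smallest_group_size(records, ["postcode"])

# Combining fields isolates one record, which is then re-identifiable.
k_combined = smallest_group_size(records, ["postcode", "birth_year", "sex"])
```

In this made-up data, postcode alone leaves every person in a group of at least two, while the combination of postcode, birth year and sex singles out one record. A check of this kind could form one input to a privacy impact assessment; it does not replace one.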
Turning to the technical architecture of big data projects, the BPG recommends that scalability, extensibility, performance and compatibility be considered in the infrastructure architecture. Because data sizes are uncertain, it is important to be able to accommodate growing data stores. An extensible architecture can be added to later without being constrained by earlier design choices. Requirements for compatibility with existing systems need to be considered. Existing data stores can contribute significantly, and value from them may be faster to realise. The infrastructure architecture also needs to take into account processing capacity requirements. Where processing demand is uncertain, cloud computing may be an option.
As big data projects become more widely used, management practices will evolve. The BPG is named a ‘Better Practice Guide’, rather than a ‘Best Practice Guide’, in recognition of this evolution. However, the many good practices already identified will go a long way to preventing management pitfalls.
For further reading, see Australian Public Service Better Practice Guide to Big Data © Commonwealth of Australia 2014,
http://www.finance.gov.au/sites/default/files/APS-Better-Practice-Guide-for-Big-Data.pdf
ANNE VERNEY
Anne has over 30 years’ experience in the ICT industry. She started her career in software development working for Canadian Pacific. Since moving to Australia, she has primarily consulted in infrastructure technology. She has provided services for numerous commercial and government clients, such as CBA, Sydney Water, Promina, IBM, BMC Software, Woolworths, Esso, and CSC. She has an interest in promoting quality in ICT engineering and operational areas. Her work contributed to gaining ISO 9000 accreditation for CSC Australia. She recently provided technology architecture design governance for the billion-dollar Commonwealth Bank Core Banking Modernisation Programme.
Contact Anne at anne.verney@gmail.com