Mismatch between big data prioritisation and preparedness

Despite a majority of IT department employees (59 per cent) stating that big data is among their organisation’s top five priorities, only 10 per cent have defined an “enterprise-wide big data architecture”.

The findings are from a survey by CSC subsidiary Infochimps, in partnership with SSWUG.org. More than 300 IT departments were surveyed for ‘Big Data Through the Eyes of Your IT Staff’.

According to CSC, the big data adoption cycle – or the process of fully integrating data-driven insight throughout the organisation – has six phases:

  • Phase 1: Big data is explored.
  • Phase 2: Initial use-case deployment is underway or complete.
  • Phase 3: Multiple departments have deployed solutions.
  • Phase 4: Enterprise Hadoop/NoSQL cluster is integrated.
  • Phase 5: Lines of business collaborate to expand big data solutions.
  • Phase 6: Enterprise big data is integral to company-wide operations.

However, the biggest barrier to progressing through these phases is finding the right talent, which 86 per cent of respondents cited as an issue in big data implementation, followed by finding the right tools (77 per cent) and time (74 per cent). The McKinsey Global Institute projects a shortfall of up to 190,000 data scientists by 2018.

Fifty per cent of organisations said they were still in the data exploration phase with big data, 29 per cent had moved to initial use-case deployment, and only 7 per cent had moved to company-wide deployment of big data.

In a move that may rattle major vendors, 50 per cent of respondents are prioritising open source tools and technologies. Open source and webscale tools ranked twice as high as Oracle and three times as high as SAP as the technology of choice for big data.

The top requirements of big data solutions are ease of management; speed to deploy and flexible management, ranked equally; and security. Perhaps not surprisingly then, internal data centres were ranked as the preferred deployment option for big data projects, followed by virtual private cloud and public cloud.

Do these results reflect where your organisation is in the big data journey? And where do you stand on the open source/proprietary platform question?