Could text analytics be the cutting-edge technology the oil and gas industry was waiting for?



Small causes can have large effects; or how a discovery in the Barnett Shale can spark interest in the rest of the world and change the face of the industry.

This article is co-written by Sylvie Jacquet-Faucillon, Senior Analytics Presales Consultant, SAS France; and David Dozoul, Senior Adviser for Oil and Gas, SAS Global Energy Practice

Urgent need for digital transformation in the oil and gas industry

Demand for oil and natural gas is constantly growing, and the success of providers in meeting this demand is driven by both technological expertise and innovation capabilities. Innovation is also at the core of SAS.

With the recent turmoil in the energy industry, oil and gas companies must overcome some key challenges – price declines, strong competition, and resource replacement – while dealing with geopolitical contexts. The industry, therefore, needs to embrace new cutting-edge technologies to remain competitive and profitable.

Exploration and production projects are always large-scale, expensive and complex undertakings with long timelines. Collaboration drives innovation in the oil and gas sector to share exploratory costs and to manage risks. One of the biggest challenges is to define the exploration strategy. Applications extend to finding the best partners, identifying new trends (such as unconventional gas), and benchmarking active companies and their stakes. To stay ahead of competition and ahead of new ventures, organizations not only need to monitor trends, but also identify weak signals such as market movers and new exploration trends. It’s time for the oil and gas sector to address the big data challenge and exploit the wealth of unstructured data available: industry and government reports, local knowledge, geological studies, digital media, and so much more.

Currently, most companies manually monitor petroleum announcements from several online news providers. The volume and diversity of the data sources to collect, analyze and process demand extensive resources (people and time) when handled with traditional methods, and leave little time to enhance and interpret the results for better decision making. But with text analytics, a new approach is now available to automate, optimize and industrialize these time-consuming manual tasks.

From oil refining to data refining: Text analytics is the answer

To leave no opportunity unexplored and manage the risk of the investment, the answer is a strong competitive strategy based on data-driven analytics. One innovative and differentiating component of this strategy should be SAS® Text Analytics, which automatically processes and analyzes large amounts of multiple text data sources to create relevant and accurate information for all upstream activities. The two key components used are:

  • SAS Crawler, which automatically crawls all relevant key online news providers, including secured sites.
  • SAS Contextual Analysis, which combines the benefits of automated natural language processing and machine learning, enriched with human subject-matter expertise.

Methodology workflow


SAS Text Analytics Methodology

With SAS Contextual Analysis, oil and gas providers can easily customize the default taxonomies, entities and concepts provided with the software to better address the requirements of their specific industry. Some examples are:

  • Extraction of critical business concepts from news feeds (company, basin, block, country and well names) and references to oil and gas terms such as stakes, hydrocarbon types, volume of the discovery, height of the gas column, water depth, etc.
    “Company B has confirmed a gas-cond discovery in XX block, Water Depth 38m. Discovery resources are pegged at 250-350 MMboe in-place, and the well tested 10.6 MMcfg/d and assoc. cond. from a 240m hc column in the L. Cretaceous pre-salt…”
    Transfers of participations or the winners of a bid process can also be detected automatically by extracting facts and events from text:
    “Company C has agreed to acquire Company D's 13.058% interest in block C in exchange for $15,000 cash payment (US$10,814) …”
  • Categorization of news among several business categories: discoveries, negative drilling results (subeconomic, dry well), changes in permits (joint ventures, mergers and acquisitions, farm-in, farm-out, award). How does this work? We either define linguistic rules to classify documents or leverage the SAS Text Analytics rules generator capability. Our unique rules generator enables “active learning” from already-classified news (some petroleum news providers deliver news with some metadata) and generates the associated linguistic rules.
  • Discover the unexpected: SAS algorithms drive automatic topic discovery, so you can go much further in your analysis by uncovering upcoming trends, new topics or weak signals.
  • SAS also provides the ability to integrate internal and external oil and gas repositories (for instance, definitions of wells, blocks, companies and basins), remove duplicates, and clean the information to get consistent data, which is a key prerequisite for reliable results. The data quality step ensures proper data enrichment and eliminates the time-consuming task of matching journalistic information with industry-confirmed information.
  • Exploration, reporting and industrialization: As an integrated solution, SAS enables processing automation from data crawling to text analytics and data visualization. Time and resources are saved by automating the time-consuming tasks of reading news, manually classifying it and building dedicated reports. Subject-matter experts can focus on detailed analysis and extend coverage of investigation to monitor more companies or additional topics (planned wells, seismic information, bid round analysis).
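
To make the concept extraction and categorization steps above more concrete, here is a minimal sketch in plain Python. It is not SAS Contextual Analysis and does not use any SAS API: the regular expressions, the keyword rules, the category labels and the function names are all hypothetical, chosen only to fit the two sample news snippets quoted above. Real linguistic rules would need to cover far more variation.

```python
import re

# Hypothetical patterns for illustration only; real news feeds vary far more.
WATER_DEPTH = re.compile(r"water depth\s*(\d+(?:\.\d+)?)\s*m\b", re.IGNORECASE)
RESOURCES = re.compile(r"(\d+(?:\.\d+)?)\s*-\s*(\d+(?:\.\d+)?)\s*MMboe", re.IGNORECASE)
STAKE = re.compile(r"(\d+(?:\.\d+)?)\s*%\s+interest", re.IGNORECASE)

# Hypothetical keyword rules, checked in order; the first matching category wins.
CATEGORY_RULES = [
    ("permit change", ("acquire", "farm-in", "farm-out", "joint venture", "award")),
    ("negative drilling result", ("dry well", "subeconomic")),
    ("discovery", ("discovery",)),
]


def extract_concepts(text):
    """Return the numeric concepts found in one news snippet."""
    concepts = {}
    match = WATER_DEPTH.search(text)
    if match:
        concepts["water_depth_m"] = float(match.group(1))
    match = RESOURCES.search(text)
    if match:
        concepts["resources_mmboe"] = (float(match.group(1)), float(match.group(2)))
    match = STAKE.search(text)
    if match:
        concepts["stake_pct"] = float(match.group(1))
    return concepts


def categorize(text):
    """Assign the first category whose keywords appear in the snippet."""
    lowered = text.lower()
    for category, keywords in CATEGORY_RULES:
        if any(keyword in lowered for keyword in keywords):
            return category
    return "uncategorized"


discovery_news = (
    "Company B has confirmed a gas-cond discovery in XX block, Water Depth 38m. "
    "Discovery resources are pegged at 250-350 MMboe in-place"
)
stake_news = "Company C has agreed to acquire Company D's 13.058% interest in block C"

for snippet in (discovery_news, stake_news):
    print(categorize(snippet), extract_concepts(snippet))
```

Hand-written rules like these are easy to audit but brittle, which is one reason the article pairs them with a rules generator that learns from already-classified news.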

Did you say ‘challenges’?

The software also addresses common challenges of text analytics projects that are not necessarily specific to the oil and gas industry:

  • High volume of rules: Oil and gas references to blocks, basins, wells and companies (with subcompanies) result in several million classifier rules that are updated weekly in SAS Contextual Analysis.
  • Advanced disambiguation process with accurate concept rules: What happens when the basin name is also a country name? When an oil and gas company name is also a usual business term? How do you map with geospatial data and get an accurate data visualization if your contextual extraction is not quite perfect?
  • Advanced predicate rules extract key and relevant information such as proper stakes changes, awarded companies and bid runners from articles.
  • A strong data management and data quality foundation is needed to match and link entities (wells, companies, blocks, basins) to each other, even though each journalistic source has its own way to write the same information (e.g., a well name differs slightly between two sources: ‘-’ replaced by ‘_’ or by a space).
  • Strong subject-matter expertise integration through accurate and manageable rules.
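
The well-name matching challenge in the list above can be illustrated with a small sketch, again in plain Python rather than SAS data quality tooling. It assumes that the only differences between sources are the separator (hyphen, underscore or space) and letter case. The well names and function names are hypothetical.

```python
import re
from collections import defaultdict


def normalize_well_name(name):
    """Collapse hyphens, underscores and whitespace into one hyphen, then uppercase."""
    return re.sub(r"[-_\s]+", "-", name.strip()).upper()


def group_mentions(mentions):
    """Group raw mentions that normalize to the same canonical well name."""
    groups = defaultdict(list)
    for mention in mentions:
        groups[normalize_well_name(mention)].append(mention)
    return dict(groups)


# Hypothetical well names as they might appear across different news sources.
mentions = ["Alpha-1", "Alpha_1", "alpha 1", "Beta-2"]
groups = group_mentions(mentions)
print(groups)
```

Normalizing to a canonical key before matching keeps the original spellings available for audit while letting downstream steps link each mention to a single entity.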

Text analytics in oil and gas: The art of the possible

The digital transformation makes it mandatory for the leaders in the oil and gas industry to compete on advanced analytics and data management proficiency. Oil and gas companies can take advantage of SAS software’s advanced text analytics, data management and data visualization capabilities to identify useful insights that can be used to create better outcomes through smarter decisions.

The stakes are huge with current oil prices. The cost of exploration with standard discovery rates pushed the industry to reinvent itself and identify more agile and smarter ways to select projects in which to invest. Indeed, learning from experience is a golden rule that most of us are accustomed to in the industry. Actually, learning from competitors’ experiences, easily monitoring crucial trends in the market, and cross-checking that information against internal expertise enables organizations to better understand important market movements and identify opportunities ahead of the competition.

Opportunities can come in different ways, such as deciding to take stakes in a promising exploration block, taking over a competitor or exploring new reservoir types. Failure to identify these opportunities in time often leads to shifts in market share, delays to first oil or uncertain reserve replacement.

Text analytics capabilities can also address a wide range of other applications in the oil and gas sector, in addition to the competitive intelligence case studies discussed in this article:

  • Patent analysis.
  • Operational and maintenance optimization.
  • Warranty claims analysis, root cause analysis on logs, call center notes.
  • Consumer sentiment analysis: Learn how customers feel about the products they use (call centers, survey feedback, online forums).
  • Procurement analysis (bids, competition, supply chain contracts).
  • Health safety environment reports analysis.

Should you have any questions, feel free to contact us.


About Author

Sylvie Faucillon


As Principal Advisor in Analytics, Sylvie Faucillon helps clients accelerate their digital transformation by leveraging data and analytics with SAS and co-creating solutions that far exceed their expectations. Her 15-year journey at SAS has earned her a reputation for being creative and passionate while cultivating a strong customer focus. She is the catalyst who has led, developed and executed the SAS SWEE expertise on NLP, text analytics and conversational agents across the board: customer advisory, delivery, training, public forums and marketing events, for a wide range of industries. She is the author of several blogs and articles. Before joining SAS Institute, she worked as an R&D expert at SAP Business Objects. She holds a European engineering diploma in Computer Sciences and Cognitive Sciences.
