The database contained the following:
Total size: 204.0 GB
Total records: 1,148,327,940
Indices: Aggregate data and event data.
Types: Add to Shopping Cart, Configuration, Dashboard, Index Mode, More Improvements, Order, Delete from Shopping Cart, Search, Server.
Production records exposed visitor IDs, session IDs and device information (such as iPhone, Android, iPad, etc.). A sample of search queries revealed email addresses, which could be targeted in social engineering or phishing attacks, or used to cross-reference other activity. The files also gave a clear picture of the configuration settings, the location where the data was stored, and a blueprint of how the logging service runs on the back end.
CVS Health acted quickly and professionally to secure the data. A member of its information security team contacted me the next day and confirmed my findings: the data was indeed theirs. I was told that a contractor or supplier managed the dataset on behalf of CVS Health, but that the supplier's identity was confidential.
The exposed records were marked as "production". To look for potentially identifiable information, I ran search queries for common email domains such as Gmail, Hotmail and Yahoo. Each query returned results indicating that the dataset contained email addresses. Many personal email addresses are formed from part or all of the owner's name, and I was able to identify a small number of individuals by searching Google for the exposed email addresses.
The records also contained a "Visitor ID" and a "Session ID". I saw many records showing visitors searching for a range of items, including medications, COVID-19 vaccines and other CVS products. Hypothetically, someone could match a session ID with what was searched for or added to the shopping cart during that session, and then try to identify the customer using the exposed emails.
According to a CVS representative, these emails did not come from CVS customer account records but were typed into the search bar by visitors themselves. The search bar captures and logs everything entered into the website's search function, and these records are stored as log files.
One possible theory, based on the mobile version of the CVS website, is that visitors believed they were logging in to their accounts when they were actually typing their email addresses into the search field. Each search is recorded with an event type parameter set to search, and the email address appears as the value of a parameter named Query. This could explain why so many email addresses ended up in a database of product searches that was never intended to identify visitors. The records also showed which device was used. Most of the searches I saw came from mobile phones and other mobile devices, but desktop computers appeared as well.
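To make the record layout described here concrete, the following minimal sketch shows what one such log entry might look like and how a search event carrying an email address could be recognized. The field names (event_type, query, visitor_id, session_id, device) and all values are hypothetical; the actual schema of the exposed database was not published.

```python
# Hypothetical log entry. Field names and values are illustrative
# assumptions, not the actual schema of the exposed database.
record = {
    "event_type": "search",
    "query": "jane.doe@example.com",
    "visitor_id": "a1b2c3d4e5",
    "session_id": "s9f8e7d6c5",
    "device": "iPhone",
}


def is_search_with_email(entry: dict) -> bool:
    """Return True when a search event's query looks like an email address."""
    if entry.get("event_type") != "search":
        return False
    query = entry.get("query", "").strip()
    local, sep, domain = query.partition("@")
    # Rough heuristic: exactly one "@", a non-empty local part,
    # a dot somewhere in the domain, and no spaces.
    return (
        bool(sep)
        and bool(local)
        and "@" not in domain
        and "." in domain
        and " " not in query
    )


print(is_search_with_email(record))
```

An ordinary product search such as "cough syrup" fails the check, as does any event whose type is not search.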
CVS Health provided us with the following statement:
"Thank you again for contacting us. We were able to contact our suppliers and they took immediate action to delete the database. Protecting the private information of our customers and companies is a top priority. It should be noted that this database does not contain any personal information of our customers, members or patients. "
Activity logs are an essential tool for tracking all activity on a website or e-commerce platform, and they help build valuable insights about visitors and customers. Such logging and tracking often includes metadata or error logs that can inadvertently expose more sensitive records. In this case, the data consisted of search logs of everything visitors searched for, including references to both CVS Health and CVS domain names. This provides valuable analytics for understanding what customers are looking for and whether they found the products they wanted.
The logging system used mixed alphanumeric visitor IDs to keep shoppers anonymous. It should be noted that the database was not collecting visitors' personal data or email addresses from the shopping cart. Unfortunately, human error is to blame on two counts: the misconfiguration that publicly exposed the database, and website visitors entering their own email addresses in the search field. I suggested to CVS that in future they prevent any search that matches an email address pattern or domain name from being executed or logged. This would help avoid collecting or storing unnecessary data.
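The recommendation above can be sketched as a small filter that runs before a search query is written to the log. This is only an illustration under stated assumptions: the regular expression is a simplified email pattern rather than a complete validator, and the logger name, function names and log format are hypothetical, not taken from any CVS system.

```python
import logging
import re

# Simplified pattern for email-like strings. An assumption for
# illustration, not a full RFC 5322 validator.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

logger = logging.getLogger("search_events")  # hypothetical logger name


def sanitize_query(query: str) -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_PATTERN.sub("[REDACTED_EMAIL]", query)


def log_search(session_id: str, query: str) -> str:
    """Log a search event with email-like text removed from the query."""
    safe_query = sanitize_query(query)
    logger.info("event_type=search session_id=%s query=%s", session_id, safe_query)
    return safe_query
```

Redacting at the point of logging means the address never reaches storage, so a later misconfiguration of the log database cannot expose it. Refusing to execute such a search at all, as the suggestion also mentions, would be a separate check in the search handler.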
The visitor ID and session ID contain no identifiable data on their own; identifying a user becomes possible only when they are combined with an email address. Theoretically, a search still creates a session ID that may not change during the visit, so the email can be linked to that session ID. This exposure may have made an unknown number of sessions identifiable: users typed their email into the search bar and then went on to perform other actions, such as further searches and adding or removing products from their online shopping carts. The session IDs associated with emails were unique and the timestamps were not consecutive, which indicates that these were unlikely to be automated search queries.
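For a team auditing its own logs, the linkage described above suggests a simple way to estimate how many sessions are affected: group events by session ID and flag any session in which at least one search query looks like an email address. The sketch below uses made-up events and hypothetical field names (session_id, event_type, query); it is not based on the actual records.

```python
import re
from collections import defaultdict

# Hypothetical events. Field names and values are illustrative assumptions.
events = [
    {"session_id": "s1", "event_type": "search", "query": "jane.doe@example.com"},
    {"session_id": "s1", "event_type": "search", "query": "allergy relief"},
    {"session_id": "s1", "event_type": "add_to_cart", "query": ""},
    {"session_id": "s2", "event_type": "search", "query": "vitamins"},
]


def looks_like_email(text: str) -> bool:
    """Loose check: something@something.something with no spaces."""
    return re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", text.strip()) is not None


def sessions_with_email(log: list) -> dict:
    """Map each session containing an email-like search to its event count."""
    by_session = defaultdict(list)
    for event in log:
        by_session[event["session_id"]].append(event)
    return {
        sid: len(evts)
        for sid, evts in by_session.items()
        if any(
            e["event_type"] == "search" and looks_like_email(e["query"])
            for e in evts
        )
    }


print(sessions_with_email(events))
```

In this toy data only session s1 is flagged, and all three of its events, including the unrelated product search and the cart action, become attributable to the email address typed during that session.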
We do not download or extract the publicly accessible data we find; we take only a limited number of screenshots for verification. It is always a race against time to help secure exposed data before it is exploited or wiped out by ransomware. We could not review all 1.1 billion records, both because of the urgency of responsibly reporting the exposure and because of the speed with which CVS's supplier restricted public access. We were able to look at only a limited sample of records, not the whole dataset.
Other risks considered
When any database is exposed, it may reveal configuration, application, software, operating system and build information that could help identify unpatched or outdated components with potential vulnerabilities. Cybercriminals and nation-states use sophisticated methods to collect and exploit the data they find, and they often use the same methods as legitimate security researchers to identify publicly exposed data. While we work to secure data every day, cybercriminals try to use the same data for malicious purposes. Each record is a piece of the puzzle that contributes to a larger picture of an organization's network or data storage methods.
According to Wikipedia, CVS Health is an American healthcare company that owns CVS Pharmacy, a retail pharmacy chain; CVS Caremark, a pharmacy benefits manager; Aetna, a health insurance provider; and many other brands. The company is headquartered in Woonsocket, Rhode Island.
We are not implying any wrongdoing by CVS Health, its contractors or its suppliers. Nor are we claiming that customers, members, patients or website visitors were at risk. The theories expressed here are based on hypothetical possibilities of how this data could be used. We highlight our discovery only to raise cybersecurity awareness and to show that simple things, such as search logs and misconfigured databases, can capture and expose data.