Analysis of Hospital Quality Measures and Web-Based Chargemasters, 2019: Cross-sectional Study

Background: The federal health care price transparency regulation from 2019 is aimed at bending the health care cost curve by increasing the availability of hospital pricing information for the public.
Objective: This study aims to examine the associations between publicly reported diagnosis-related group chargemaster prices on the internet and quality measures, process indicators, and patient-reported experience measures.
Methods: In this cross-sectional study, we collected and analyzed a random 5.02% (212/4221) stratified sample of US hospital prices in 2019 using descriptive statistics and multivariate analysis.
Results: We found extreme price variation in shoppable services and significantly greater price variation for medical versus surgical services (P=.006). In addition, we found that quality indicators were positively associated with standard charges, such as mortality (β=.929; P<.001) and readmissions (β=.514; P<.001). Other quality indicators, such as the effectiveness of care (β=−.919; P<.001), efficient use of medical imaging (β=−.458; P=.001), and patient recommendation scores (β=−.414; P<.001), were negatively associated with standard charges.
Conclusions: We found that hospital chargemasters display wide variations in prices for medical services and procedures and that these variations are associated with variations in quality measures. Further work is required to investigate 100% of US hospital prices posted publicly on the internet and their relationship with quality measures.


Background
Increases in health care expenditures have persisted throughout the years in the United States despite policy efforts to bend the curve. According to the Centers for Medicare and Medicaid Services (CMS), US health care spending in 2018 increased 4.6% from the previous year and totaled US $3.6 trillion [1]. A contributing factor to the rise in health care expenditures is that hospitals do not compete on price in the same way that firms in other efficient product markets do (such as the e-commerce sector). Currently, there are differences between what is charged by health care systems and what is paid by consumers [2]. Consequently, consumers are, in effect, price takers accepting the hospital charges negotiated with their insurer [3].
As a result, consumers have historically been less price-sensitive when making health care decisions than when making decisions in other economic sectors. With the continual increase in US health care spending, a widely held view is that greater consumer engagement in health care will help hold prices down. In turn, greater consumer engagement will slow the sector's expansion rate if (and when) consumers place substantial emphasis on making price-sensitive decisions using price transparency information [4,5]. To that end, CMS has issued 2 regulations that require hospitals to increase their price transparency [6,7]. The first regulation required hospitals to publicly disclose their diagnosis-related group (DRG) chargemasters on the web in a machine-readable form (such as a Microsoft Excel file) starting in 2019. Releasing the DRG chargemasters on the internet was met with little resistance from hospitals, as the information did not reveal negotiated hospital pricing strategies vis-à-vis third-party payors or competitors. Although there was little resistance to the first federal regulation, previous literature has documented many hospitals that are not price transparent, are noncompliant, or have inaccessible pricing information [8-10].
Nonetheless, understanding newly available US chargemaster information is vital to patients because American patients are sent a medical bill after receiving treatment. A medical bill will contain the patient's portion owed of hospital standard charges for medical services and procedures that were delivered net of any contractual allowances and third-party payments. Therefore, standard charges are relevant to the consumer, either directly by influencing their purchase decisions before receiving medical care or indirectly when they receive a medical bill after treatment.

Objective
This study aims to assess the variability of publicly available DRG chargemaster data and its relation to quality measures, process indicators, and patient experience measures as a source of information for consumer quality assessment and price-sensitive decision-making purposes. The research benefits three audiences. For policy makers, this study provides an early assessment of the pricing transparency regulation's utility. For researchers, being able to collect and compare hospitals' pricing data is an important task if they are to inform policy maker efforts on controlling health care spending. In addition, researchers can inform the public at large and assist other stakeholders, such as nongovernmental organizations, in providing an analysis of pricing information found on chargemasters that is understandable. Finally, for health care administrators, this research can shed new light on the importance of presenting standard charges to the public in compliance with the law.

Procedures
We conducted a cross-sectional study of web-based publicly available hospital chargemasters from 2019. First, we assessed the descriptive statistics and coefficients of variation (CVs) to describe the standard charges grouped by the DRG code. We aimed to describe the full extent of price variability in hospital standard charges.
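The descriptive step can be illustrated with a minimal sketch that groups charges by DRG code and computes the CV as the SD divided by the mean. The function name `cv_by_drg`, the sample rows, and the use of the sample (n−1) SD are assumptions for illustration only; the study does not state which SD estimator was used.

```python
from collections import defaultdict
from statistics import mean, stdev


def cv_by_drg(charges):
    """Summarize (drg_code, standard_charge) pairs by DRG code."""
    grouped = defaultdict(list)
    for drg, charge in charges:
        grouped[drg].append(charge)

    summary = {}
    for drg, values in grouped.items():
        if len(values) < 2:
            continue  # the sample SD is undefined for a single observation
        m = mean(values)
        sd = stdev(values)  # assumption: sample (n-1) standard deviation
        summary[drg] = {
            "n": len(values),
            "mean": m,
            "sd": sd,
            "cv": sd / m,  # coefficient of variation
            "min": min(values),
            "max": max(values),
        }
    return summary


# Hypothetical charges, not taken from the study data.
rows = [
    ("470", 40000.0), ("470", 60000.0), ("470", 80000.0),
    ("473", 30000.0), ("473", 90000.0),
]
summary = cv_by_drg(rows)
for drg, stats in sorted(summary.items()):
    print(drg, stats["n"], round(stats["mean"]), round(stats["cv"], 3))
```

Because the CV is unitless, it allows variability to be compared across services whose mean charges differ widely.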
We then performed 2 median chi-square tests on standard charges and type of service (either medical or surgical). Median chi-square tests were performed because the standard charges were not normally distributed, that is, standard charges were skewed to the right. The first test was for average standard charge (either above the median or below the median) by the type of service (either medical or surgical). Similarly, the second test was for the CV (either above the median or below the median) by the type of service (either medical or surgical).
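The dichotomization behind a median test can be sketched as follows: each value is classified as above or not above the overall median, and the counts are cross-tabulated by type of service. The function name, the sample values, and the handling of ties (values equal to the median counted as not above) are assumptions, not details reported in the study.

```python
from statistics import median


def median_split_table(values, groups):
    """Return the overall median and {group: [n_above, n_not_above]}."""
    cut = median(values)
    table = {}
    for value, group in zip(values, groups):
        row = table.setdefault(group, [0, 0])
        # Assumption: ties with the median are counted as "not above".
        row[0 if value > cut else 1] += 1
    return cut, table


# Hypothetical per-DRG average charges and their type of service.
values = [10, 40, 30, 20, 50, 60]
groups = ["medical", "medical", "medical", "surgical", "surgical", "surgical"]
cut, table = median_split_table(values, groups)
print(cut, table)
```

The resulting 2×2 table of counts is the input to the Pearson chi-square statistic.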
Next, we performed a log-linear, ordinary least squares regression model to fit the natural log-transformed standard charges on hospital characteristics. We log-transformed the dependent variable (standard charges) owing to the right-skewness and lack of normal distribution. We removed outliers whose residuals fell more than 1.5 times the IQR below the first quartile or more than 1.5 times the IQR above the third quartile. β coefficients, P values, and robust SEs were presented for the predictors. Robust SEs were clustered on hospital to correct for related observations. All analyses were conducted using Microsoft Excel and Stata/SE 15.1. The institutional review board of the University of Alabama at Birmingham exempted this study.
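The log transformation and residual-based outlier rule can be illustrated with a deliberately simplified sketch. It fits a one-predictor least squares line in pure Python rather than the multivariate Stata model with cluster-robust SEs used in the study. The data, the helper names, and the quartile method (the default of `statistics.quantiles`) are all assumptions.

```python
import math
from statistics import mean, quantiles


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def iqr_keep_mask(residuals, k=1.5):
    """True for residuals inside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(residuals, n=4)  # assumption: default quartile method
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [low <= r <= high for r in residuals]


# Hypothetical data: a binary hospital characteristic and right-skewed charges.
is_large = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
charges = [1000, 1100, 900, 1050, 950, 2000, 2200, 1800, 2100, 500000]

log_y = [math.log(c) for c in charges]  # natural log of the dependent variable
a, b = fit_line(is_large, log_y)
residuals = [yi - (a + b * xi) for xi, yi in zip(is_large, log_y)]

keep = iqr_keep_mask(residuals)
x_kept = [xi for xi, k in zip(is_large, keep) if k]
y_kept = [yi for yi, k in zip(log_y, keep) if k]
a2, b2 = fit_line(x_kept, y_kept)  # refit after dropping flagged observations
print(sum(keep), round(b, 3), round(b2, 3))
```

In this toy example, the single extreme charge inflates the slope of the first fit; once it is flagged by the 1.5×IQR rule and dropped, the refitted slope is much smaller.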

Data Source
We retrieved chargemasters from hospital websites on the internet between August 25, 2019, and October 3, 2019, if they were formatted using DRG primary codes (eg, chargemasters using Healthcare Common Procedure Coding System or Current Procedural Terminology primary codes were excluded). In line with previous research, we obtained common hospital characteristic data from Hospital Compare, CMS, the American Hospital Association (AHA), and the Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS).

Sampling Strategy
We constructed a random, stratified sample to assess US hospitals (Figure 1). It has been previously shown that hospital website quality is associated with HCAHPS recommendation scores [11]. Thus, to ensure an adequate variation of low- to high-quality websites, we stratified hospitals (n=4221) listed in HCAHPS data from October 1, 2017, to September 20, 2018, into 4 ranked quartiles based on the measure, "Patients who reported YES, they would definitely recommend the hospital." In total, 1.26% (53/4221) of hospitals were randomly selected from each of the 4 strata, representing a total of 5.02% (212/4221) of the hospitals in the HCAHPS data set. The sample size was restricted to maintain the feasibility of manual data collection and processing costs [12]. In sum, we had 29,167 observations of standard charges grouped by 81 different hospitals.
Figure 1 notes. a Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS) survey from the third quarter of 2018. b Some hospitals provided data in an unusable format, such as in the All Patient Refined-diagnosis-related group coding format vs Medicare Severity-diagnosis-related group, providing maximum or minimum charges vs standard charges, etc. DRG: diagnosis-related group.
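The stratified draw described above can be sketched as follows, assuming hospitals are ranked by recommendation score, cut into 4 roughly equal strata, and sampled without replacement within each stratum. The hospital identifiers, the simulated scores, the seeds, and the tie-breaking by sort order are hypothetical and do not reflect the actual HCAHPS data.

```python
import random


def stratified_sample(scores, n_strata=4, per_stratum=53, seed=0):
    """scores maps hospital_id -> recommendation score; returns {stratum: ids}."""
    ranked = sorted(scores, key=scores.get)  # lowest to highest score
    size = len(ranked)
    rng = random.Random(seed)
    sample = {}
    for k in range(n_strata):
        # Contiguous slices of the ranking form non-overlapping strata.
        stratum = ranked[k * size // n_strata:(k + 1) * size // n_strata]
        sample[k + 1] = rng.sample(stratum, per_stratum)
    return sample


# Simulated scores for 4221 hypothetical hospitals.
score_rng = random.Random(1)
scores = {f"H{i:04d}": score_rng.randint(40, 95) for i in range(4221)}

sample = stratified_sample(scores)
total = sum(len(ids) for ids in sample.values())
print(total, round(100 * total / len(scores), 2))
```

Drawing 53 hospitals from each of 4 strata yields 212 hospitals, which is 5.02% of 4221.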

Predictors for Hospital Characteristics
Quality predictors included benchmark measures for the hospital's overall rating (hospital rating categories: 1 star, 2 stars, 3 stars, 4 stars, 5 stars, and missing), mortality rate, safety score, readmission rate, effectiveness of care score, efficient use of medical imaging score, and patient experience score. These measures and their categorical values (either below the national average, same as the national average, above the national average, or missing) were obtained using the Hospital General Information data set from the CMS. In addition, we included one additional patient experience measure: the likelihood of patients to recommend the hospital using the quartile categories described in the Sampling Strategy section (1=lowest quartile and 2, 3, and 4=highest quartile).

Controls
Control variables included hospital ownership type (government-hospital district or authority, government-local, physician, proprietary, voluntary nonprofit-church, voluntary nonprofit-other, and voluntary nonprofit-private). Next, using data from the AHA Annual Survey, the hospital bed size (6-24 beds, 25-49 beds, 50-99 beds, 100-199 beds, 200-299 beds, 300-399 beds, 400-499 beds, and 500 or more beds) was controlled. Previous work has used the number of competitors in the market as a measure of competition (rather than the Herfindahl-Hirschman index) [13]. In line with these studies, we calculated a control measure for competition using the number of Medicare providers per 5-digit ZIP code (1=least and 2 and 3=most) from the HCAHPS data set. The DRG primary code was controlled using individual dummies for each of the DRG primary codes from the CMS data set for Medicare Severity-DRG version 36. Finally, 2 geographical control variables were included using the AHA Annual Survey data set. They were regions (New England, Mid Atlantic, South Atlantic, East North Central, East South Central, West North Central, West South Central, Mountain, and Pacific as defined by the AHA Annual Survey) and US states (individual dummies for each US state).

Variance Analysis
CMS specifically defined 5 services using DRG primary codes to be shoppable in a forthcoming regulation on increasing health care price transparency (effective January 1, 2021). We present the price variability for these 5 shoppable medical services for our 5.02% (212/4221) sample of US hospitals in Table 1. The shoppable medical services included in our analysis were sorted from most to least variable, as measured by the CV. The maximum standard charge was frequently many times higher than the minimum. For example, the standard charge for DRG primary code 473 cervical spinal fusion without comorbid conditions or major comorbid conditions or complications had a mean value of US $89,302 (SD US $50,122), a CV of 0.561, and ranged from US $30,924 to US $249,283. The maximum standard charge for the procedure was over US $210,000 more than the minimum standard charge out of 44 hospitals that performed the service.
Table 1 notes. b These are the only 5 selected services using the diagnosis-related group primary code (as opposed to the Current Procedural Terminology or Healthcare Common Procedure Coding System) that the Centers for Medicare and Medicaid Services determined to include in the forthcoming regulation (effective date January 1, 2021), which mandates public disclosure of payer-specific negotiated charges, deidentified minimum and maximum negotiated charges, and discounted cash prices for at least 300 shoppable services, including 70 Centers for Medicare and Medicaid Services-specified shoppable services and 230 hospital-selected shoppable services. c DRG: diagnosis-related group.
Next, out of the set of 761 DRG primary codes, the data for the most and least variable services by type of service (either medical or surgical) with at least 30 observations are presented in Table 2. Most and least variable services were measured by the highest and lowest CVs, respectively.
Notably, surgical procedures appeared to have higher means and lower CVs, which is investigated further in the Standard Charge and Type of Service section.
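As a quick arithmetic check, the summary figures reported for DRG primary code 473 (cervical spinal fusion) can be reproduced from the stated mean, SD, minimum, and maximum. The variable names are illustrative; the numbers are those given in the text.

```python
mean_charge = 89_302   # reported mean standard charge (US $)
sd_charge = 50_122     # reported SD (US $)
min_charge = 30_924    # reported minimum (US $)
max_charge = 249_283   # reported maximum (US $)

cv = sd_charge / mean_charge      # coefficient of variation
spread = max_charge - min_charge  # absolute gap between extremes
ratio = max_charge / min_charge   # how many times larger the maximum is

print(round(cv, 3), spread, round(ratio, 2))
```

The CV rounds to the reported 0.561, the gap of US $218,359 is consistent with "over US $210,000," and the maximum is roughly 8 times the minimum.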

Standard Charge and Type of Service
The relationship between standard charge and type of service (medical or surgical) was assessed for significant differences using 2 different median chi-square tests. The tables are presented in the top and bottom panels of Table 3. In both median chi-square tests, the cells represent the counts of individual DRG primary codes. The median chi-square test for average standard charge versus the type of service was significant (Pearson χ² with 1 df [sample size=758]=284.1; P<.001). The observed number of average standard charges was significantly greater than the expected number for surgical services, with average standard charges greater than the median (observed=309 and expected=193). The median chi-square test for CV by type of service was also significant (Pearson χ² with 1 df [sample size=758]=7.6; P=.006). However, in contrast to average standard charges, the observed number of CVs was significantly less than the expected number of CVs for surgical services, with CVs greater than the median. In summary, surgical services (as opposed to medical services) generally tended to have significantly more average standard charges and fewer CVs above the median.
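The first Pearson chi-square statistic can be illustrated with a short sketch. Only the sample size (758), the observed count (309), and the expected count (193) for surgical services above the median are reported in the text. The remaining 3 cells below are reconstructed under the assumption of an even 379/379 split around the median and are therefore an inference, not the published Table 3.

```python
def pearson_chi2(table):
    """Pearson chi-square statistic and expected counts for a 2-D count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    expected = []
    for i, row in enumerate(table):
        exp_row = []
        for j, observed in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            exp_row.append(e)
            chi2 += (observed - e) ** 2 / e
        expected.append(exp_row)
    return chi2, expected


# Rows: surgical, medical. Columns: above median, not above median.
# Only 309 (observed) and 193 (expected) are reported; the other cells are
# inferred assuming 379 DRG codes on each side of the median.
table = [[309, 77],
         [70, 302]]
chi2, expected = pearson_chi2(table)
print(round(chi2, 1), expected[0][0])
```

Under this reconstruction, the uncorrected statistic rounds to 284.1, which agrees with the reported value.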

Hospital Characteristics of Standard Charges
We examined standard charges across hospital characteristics: hospital ownership, hospital rating, mortality, safety, readmission, effectiveness of care, patient experience, competition, efficient use of medical imaging, patient recommendation, region, bed size, US state, and DRG primary code (Table 4). Using multivariate regression modeling after removing outliers, we found that our model was able to explain nearly 90% of the variation in the randomized, stratified sample of standard charges in 2019 using categorical variables for the predictors (Table 5). All quality indicators were associated with standard charges at the statistically significant α=.05 level, except for the patient safety indicator. The 2 quality indicators associated with the largest significant increases in standard charges were below the national average mortality rate (β=.929; P<.001) and below the national average readmission rate (β=.514; P<.001); they were associated with 153% and 67% significantly higher standard charges on average, respectively, compared with the national average groups, holding other factors constant. Conversely, the 3 quality indicators associated with the largest significant decreases in standard charges in our study were above the national average effectiveness of care (β=−.919; P<.001), above the national average efficient use of medical imaging (β=−.458; P=.001), and the highest quartile patient recommendation scores (β=−.414; P<.001); they were associated with 60%, 37%, and 34% significantly lower standard charges on average, respectively, than those of the reference groups, holding other factors constant.
Finally, for Table 5, please note that the interpretations of β coefficients were on average, while holding all else constant and using natural log-transformed standard charges as the outcome variable. In addition, the constant and β coefficients for the missing categories in the relevant variables were not described but can be found in the table. Robust SEs were clustered on hospital to correct for related observations. Outliers were removed as described in the Methods section, leaving 27,530 observations in the regression model. Overall, a large amount of variation in standard charges was explained by our regression model (R²=89.55%).
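Because the outcome is the natural log of the standard charge, a β coefficient on a categorical predictor translates to a percentage difference of 100×(exp(β)−1) relative to the reference group. The sketch below applies this standard conversion to the coefficients reported in the text; the dictionary labels and function name are illustrative.

```python
import math


def pct_change(beta):
    """Percentage difference implied by a coefficient in a log-linear model."""
    return 100 * (math.exp(beta) - 1)


# Coefficients as reported in the text.
betas = {
    "mortality (below national average)": 0.929,
    "readmission (below national average)": 0.514,
    "effectiveness of care (above national average)": -0.919,
    "efficient use of medical imaging (above national average)": -0.458,
    "patient recommendation (highest quartile)": -0.414,
}

percentages = {name: round(pct_change(b)) for name, b in betas.items()}
for name, pct in percentages.items():
    print(f"{name}: {pct:+d}%")
```

The rounded values (+153%, +67%, −60%, −37%, and −34%) agree with the percentages stated in the results.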

Principal Findings
Wide differences exist between hospital billed charges and the amount of money that hospitals expect to receive for services [14]. Our analysis found that chargemaster DRG prices on the internet varied greatly between facilities. At a minimum, the web-based chargemaster data do not reflect the marginal cost of performing 1 instance of a procedure. Different hospitals have widely varying fixed costs that may drive the variance to some extent, but this is not sufficient to explain the differences observed [15]. A more plausible explanation is that there are systematic differences in the business strategies related to chargemaster construction, as found in our analysis.
Reviewing Table 2, even the procedures with relatively low SDs and CVs had wide enough ranges to indicate that there is little to no relation between the chargemaster's rates and actual underlying costs. For example, a previous study on Ohio state data from 2007 to 2012 showed that a hospital with the highest median charge for a normal newborn delivery (DRG primary code: 795) could be nearly 4 times as costly as the hospital with the lowest median charge despite no differences in length of stay (which typically is 2 days) [16]. Compared with that study, we found even more drastic differences at the national level in our data set for the standard charge of a normal newborn delivery, where the maximum standard charge for the procedure was more than 1250 times greater than the minimum standard charge. Furthermore, our finding that the normal newborn delivery service had the largest variation in our data was unusual for 3 reasons. First, the mean standard charge was relatively small, which usually leads to lower variances. Next, the upper bound of US $1,268,646 defied any reasonable expectations for this service. Finally, the minimum rate, US $1005, also defied logic. Even an uncomplicated delivery typically involves a 2-day stay with a per diem above US $1400, which would total more than US $2800 for the charge [13]. Additional examples of standard charges with wide ranges for the exact same service are commonly found in the literature [17-19].
Thereafter, we sought to test the differences in variability by type of service (either medical or surgical). A previous study using standard charge data for Maryland between 1979 and 1981 found that the estimated CVs for surgical-type DRGs were significantly smaller than those for medical DRGs [20]. That study suggested that, as the DRG patient diagnosis classifications are refined over time, variation among medical-type DRGs could potentially converge toward the more favorable lower levels of variation of surgical-type DRGs [20]. However, we found that after 4 decades of revisions to DRG codes, where the number of unique codes increased from ≥400 in the 1980s to ≥700 in the 2020s, medical-type DRGs still had more variability than surgical-type DRGs. Our results indicate that it remains more difficult to predict standard charges for medical-type services, which have more variability, than for surgical services. As a result, health care providers and other stakeholders will have to work harder to assist consumers in making informed decisions, especially for medical services.
Afterward, we sought to understand whether the wide variances observed were systematically related to hospital characteristics for quality performance indicators. A number of hospital characteristics were shown to be significantly associated with standard charges, including physical characteristics such as bed size or ownership structure, geographical characteristics, controls for the service or procedure code, competition, and quality indicators (such as patient recommendation scores or readmission rates).
Overall, our results were largely consistent with those of a previous study that found that standard charges in hospital chargemasters were well predicted using hospital characteristics [21]. However, the previous study did not find sufficient evidence that hospitals with higher prices also provided a higher quality of care [21]. In contrast to this finding, we found 2 key quality characteristics to be positively and significantly associated with standard charges (when controlling for market competition, physical characteristics, geographical differences, and DRG primary code in the multivariate analysis): mortality rates and readmission rates. Furthermore, these quality indicators are consistent with economic theory, where higher quality goods and services command a higher price in the competitive market [22].
On the other hand, the relationship between price and quality is not uniformly positive or negative; it can take either direction [23]. We found 3 health care quality indicators whose results contradicted standard economic theory: effectiveness of care, efficient use of medical imaging, and patient recommendation scores. In other words, as quality increased, the standard charge decreased, which is contradictory pricing behavior.
Complexities exist in modern health care, which cause gaps in the ability of health care systems to deliver consistent, effective, and efficient care [24]. Therefore, significant undertreatment and overtreatment occur [24]. A possible explanation for higher effectiveness of care and more efficient use of medical imaging being associated with decreases in standard charges is that they lower waste and thus reduce standard charges. Finally, a potential reason for higher patient recommendation scores being associated with lower standard charges is that the hospital may benefit from increased volume (or demand owing to more patient referrals) and, in turn, from economies of scale.
At this juncture, it is important to digress from the first phase of health price transparency regulation and discuss the implications of the second phase briefly to shed some light on other implications of this study in the context of present health care systems and policies. Although hospitals provide chargemaster data, the standard charges rarely provide information for patients to make informed health care decisions [25]. As a large number of patients are insured, they are more interested in cost sharing information and specific insurer-negotiated pricing rather than standard charges for health care services. Therefore, the second round of CMS transparency regulations more broadly requires hospitals to disclose the rates they have negotiated with third-party payers for service bundles starting in 2021 [7], including the following:
• Gross charge: the charge for an individual item or service that is reflected on a hospital's chargemaster, absent any discounts
• Discounted cash price: the charge that applies to an individual who pays cash or cash equivalent for a hospital item or service
• Payer-specific negotiated charge: the charge that a hospital has negotiated with a third-party payer for an item or service
• Deidentified minimum negotiated charges: the lowest charge that a hospital has negotiated with all third-party payers for an item or service
• Deidentified maximum negotiated charges: the highest charge that a hospital has negotiated with all third-party payers for an item or service
Patients may use this additional information in 2021 to more accurately price-shop, insurers may use this information to bargain for better reimbursement rates, and other facilities may use this information to alter their pricing strategies and compete more effectively in the more transparent health care market. Therefore, this information is closely guarded by health plans [26]. Thus, hospitals, insurers, lobbying groups, and other stakeholders oppose this regulation because the negotiated prices have immeasurable proprietary strategic value, and disclosure thereof will have far-reaching implications on both price and quality competition. It remains to be seen if health systems will be able to block or alter the second phase of price transparency regulations before the scheduled implementation in 2021.

Limitations
This study on health care price transparency has 2 important limitations. First, we did not analyze pricing information from other coding systems, such as Current Procedural Terminology, Healthcare Common Procedure Coding System, or other proprietary formats. Some hospitals published chargemasters using other codes that were not mandated. Thus, the study results can only be generalized to the extent that DRG codes bundle services together correctly and correspond accurately to services rendered for patients. Some of these other coding systems rely on billing specialists to itemize services rendered, and they may or may not result in more accurate pricing, which could be higher or lower on average when compared with the DRG-coded charges we analyzed in this study. However, the DRG coding system is one of the most widely used systems for preparing patient bills, and the results of this study are directly applicable to this most commonly used hospital pricing system in the United States.
Second, we did not follow up, investigate, or verify individual observations of standard charges. It is possible (and quite likely) that hospital chargemasters unintentionally contain outdated, erroneous, or inaccurate standard charges. These mistakes may have been published on the web for the public unbeknownst to hospital administrators. We mitigated these effects as much as possible by using statistical techniques where appropriate, such as analyzing median values and removing outliers.

Conclusions
Patients are not solely influenced by costs when making health care decisions; they base their decisions on several factors, including the opinions and information supplied by their health care providers and insurers. Moreover, previous literature has shown that patients do not want to be just a cog in the health care system; rather, they want to share in the decision-making processes regarding where to seek treatment with their health care providers [27,28]. Such health care-related decisions are commonly determined based on the quality of available medical goods and services at a particular facility or by a specific provider. Therefore, patient decisions to seek treatment are being determined jointly by providers and consumers using both clinical quality and out-of-pocket cost information.
In summary, the results of this cross-sectional study, which analyzed the pricing behavior at hospitals in the first phase of the price transparency regulations, draw attention to the fact that policy makers, researchers, and health care administrators as well as, ultimately, consumers all need to be vigilant about health care price transparency and its relation to quality measures. There was extreme variation in shoppable services. Findings unearthed in this study include the following: one of the most commonly performed services (normal newborn delivery) had the most variation, significantly larger variation existed in medical services than surgical services, and quality variables were associated either positively or negatively with standard charges. It is ever more important for all the parties involved, such as researchers, policy makers, and health care administrators, to act in good faith and make the information as user-friendly and accessible as possible as well as use this information to its fullest potential: bending the health care cost curve.

Future Directions
It is crucial for researchers, policy makers, and health care administrators to work together to design a holistic registry or database system to document these chargemasters. This study has demonstrated the potential value of such information using publicly available chargemaster data on the internet from a cross-sectional random, stratified sample of 5.02% of the US hospitals. This process can be scaled up to collect, clean, and document chargemasters for all US hospitals multiple times per year, such as quarterly or semiannually.