Data Science Products and Services Website

Tableau Visualizations

Visualize Your Data | Create Dashboards | Visit Store Today | Contact Us

Blog Subscription

Enter your email address to subscribe to this blog and receive notifications of new posts by email.

Social Security and Medicare Trust Funds Deficit Concern Project

Play Project Theme Song

The first “Hypothetical Data Analysis Projects” Portfolio is the Social Security and Medicare Trust Funds Deficit Concern Project or Project 1.

The Social Security and Medicare Trust Funds Deficit Concern Project’s Hypothesis Research Statement (H1) is the following:  The discovered insights are intended to clarify the role that the Social Security and Medicare Trust Funds play in the Social Security and Medicare Trust Funds deficit problem and the U.S. Federal Government’s deficit problem, using statistics, correlations, forecasting, and prediction for exploration and clarification of the Social Security and Medicare Trust Funds Deficit Concern statistical economics domain data set.

The statistical machine learning analysis techniques that are demonstrated in this project are Tableau 2019.1, SAS® Enterprise Guide 7.1, and SAS® Enterprise Miner 14.3.

Business Area

The Business Area of the economics domain is the U.S. Social Security and Medicare Trust Funds which involves the following Federal organizations:

  • The Social Security Administration (SSA),
  • The Centers for Medicare and Medicaid Services (CMS),
  • The American Medical Association (AMA),
  • The Bureau of Labor Statistics (BLS), and
  • The Internal Revenue Service (IRS). 

Business Understanding

The U.S. Federal
Government’s Economics Domain includes data about the following two Trust
Funds:

  1. The Social Security Trust Fund Program Statistics Data.
  2. The Medicare Trust Fund Program Statistics Data.

Organization

Two U.S. Federal Government’s Trust Fund accounts or programs created by the U.S. Treasury Department:

  1. The U.S. Federal Government Social Security Trust Fund / Operated by the Office of the Chief Actuary.
  2. The U.S. Federal Government Medicare Trust Fund / Operated by the Department of Health and Human Services.

Stakeholders

The Social Security and Medicare Trust Fund Stakeholders include the following:

  • The U.S. Federal Government.
  • The U.S. Public.

Problem Description

Although the Social Security and Medicare Trust Funds are a part of the U.S. Federal Government’s Economic system and is regulated by the Office of the Chief Actuary, the complex economic deficit problem still exists which the Annual Trustees Report provides economic status and projection information regarding the Trust Funds.

The Annual Trustees Report includes information about the Trust Funds’ deficit and forecasted depletion problem which involves complex economic data and might be difficult for the average U.S. Citizen to understand without prior research or knowledge about the subject matter’s domain.

However, the U.S. Federal Government has public data sources available on the Internet that provides historical economic statistics data about Social Security Payments, Medicare Payments, Population, Income, Expenditures, Employment, Forecast Benefits, and Forecast Unemployment Rates.  The U.S. Department of Health & Human Services, ASPE, Office of the Assistant Secretary for Planning and Evaluation (2005) clearly states that resources are required to pay future Social Security and Medicare Trust Fund benefits which is derived from the economy. 

According to Thwink.org (2014), a difficult problem of large size and complexity and that has existed over a long duration of time due to various cause and effect behavior is only solved by an analytical approach to achieve a solution that is reliable opposed to an intuitive approach that is mostly used to solve a simple or an easy problem that results in a superficial solution for a difficult problem.  Also, the analytical approach requires a structure of repeatable processes that breaks the problem down into many elements to achieve the solution.  Furthermore, the analytical approach was derived from the traditional scientific method that is based on proof and rationale to achieve reliable solutions.

The project’s selected data sets include historical data that has been clarified using the analytical approach.  However, more information about systems and systems’ analysis might help to clarify the problem’s analytical approach data analysis approach selection.

Heylighen and Joslyn (1992, para. 1-3) states that the systems theory was proposed by Ludwing von Bertalaffy, Biologist, in the 1940s, and furthered by Ross Ashby which the concept focuses on the relationships between organized parts and consists of “system-environment boundary, input, output, process, state, hierarchy, goal-directedness, and information” which the system’s qualitative information can be used to help a system to evolve whereas systems analysis involves using principles along with “objectives, constraints and resources” for decision-making “to specify possible courses of action, together with their risks, costs and benefits” to solve problems with complex systems. The systems theory was derived as a reaction against reductionism which the “unity of science” focuses on the entire entity or holism rather than reducing the various parts or elements to their properties or reductionism.

Heylighen (1997, 1993, para. 1-2) states that classical science is linear and the system processes are based on past factors which traditional system models use cause and effect to determine the system states, which is why the a Linear Regression Prediction Model was most appropriate for the historical economic dataset that includes continuous statistical data as you will read about later in the portfolio.

Background

The Social Security and Medicare Trust Fund Deficit issue is an ongoing complex problem for the Federal Government and for the U.S. Public.  The Components of Trust Fund’s Income and Outgo (n.d.) lists the Trust Funds’ components as the following:  1.) Employment Taxes (i.e., taxable wages income tax), 2.) Expenditures (a.k.a., outgo/expenses), 2.) Surpluses (i.e., positive balances), 3.) Deficits (i.e., negative balances), and 4.) Asset Reserves (i.e., surpluses – deficits)).

For decades, the projected or forecasted future Social Security and Medicare Trust Funds deficit problem has been getting national attention from the United States (U.S.) Federal Government and the U.S. Public due to factors such as rising costs of benefits and expenditures.  The Social Security & Medicare Tax Data (n.d.) Web Article states that the Social Security and Medicare’s Hospital Insurance program (a.k.a., Social Security’s Old-Age and Survivors Insurance (OASI) and Disability Insurance (DI) Trust Fund, and Medicare’s Hospital Insurance (HI) Trust Fund) is mainly funded by employee and employer income tax wages (i.e., FICA and SECA initial estimated basis then adjusted to include the actual data).

Medicare.gov (2019), describes Medicare costs at a glance:  Medicare Part A (Hospital Insurance) costs, Medicare Part B (Medical Insurance) costs, Medicare Part C (Medicare Advantage), and Medicare Part D (Medicare prescription drug coverage).  The Contribution and Benefit Base (n.d.) Web Article states that the OASDI program’s annual limit for employers’ and employees’ earnings taxation for contribution and benefit (i.e., benefits computations) base or taxable maximum is set annually by a statute.  The maximum limit statue amount is subject to change annually based on the National Average Wage Index (URL:https://www.ssa.gov/oact/COLA/AWI.html).  The OASDI tax rate is 6.2% for employers and employees and 12.4% for self-employed employment income tax.  After 1993, there has been no HI annual maximum limit statute for taxable earnings for employers and employees.  The HI tax rate is 1.45% for employers and employees and 2.90% self-employed employment income tax.  The following demonstrates how the taxable income contribution and benefit base or maximum limit has risen between 1959-2019:

  • 1959-65
    Maximum Tax Limit $4,800
  • 2016
    Maximum Tax Limit $118,500
  • 2017
    Maximum Tax Limit $127,200
  • 2018
    Maximum Tax Limit $128,400
  • 2019
    Maximum Tax Limit $132,900

A summary of the 2018 Annual Reports, Social
Security and Medicare Boards of Trustees (n.d.) provided a summary message to
the public about the economic state of the Social Security and Medicare Trust
Funds.  In the 2017 fiscal year the Trust
Funds equaled 42% of the Federal program expenditures and the cost growth is
expected to rise in the mid 2030’s due to generational changes in the
population such as the aging baby boomer generation population “entering
retirement” and the lower-birth-rates generation population “entering employment”
(para. 3).  The law makers are working on
long term financing shortfalls to provide a broad range of solutions which an
expedited solution will be a better solution for vulnerable populations such as
people that are dependent on benefits and those who earn lower incomes.

According to ASPE (2005), the 2005 Social Security and Medicare Trust Fund Annual Trustees Report stated that the Social Security and Medicare Trust Fund deficit effects the budget from an economics perspective.  For example, in 2005 government receipts equal 17% of the gross domestic product, GDP, and expenditures equal 20% of the GDP which the two programs absorb 7% of the GDP and expenditures are forecasted (5% less than or exceeding over the 75 year estimation time) to grow to 13% by 2030 and 15.5% by 2050.  If a long-term solution is not met, the growing rates of consumption would equal about 80% of expenditures and most of the government’s receipts would be consumed by 2050.  Also, from a Trust Fund perspective or budgetary perspective, the reserves for both programs are the following:

  • The
    Social Security projected or forecasted reserves will pay in full until 2014
    and its income will account for 14% of the average deficit for 75 years.
  • The
    Medicare Hospital Insurance, HI, projected or forecasted reserves will pay in
    full until 2020 and its income will account for 91% of the average deficit for
    75 years.  Also, Medicare Supplementary
    Medical Insurance, SMI, expenditures are paid in full each year by the
    Treasury’s general fund finances which 3/4 of SMI’s income is derived from enrollee
    premiums.

The 2005 Trustees Report also stated that it appears that a fix for the program’s short falls would be to immediately raise the payroll tax rate from 15.3 percent to 20.31 percent (i.e., raise FICA payment rates) or equally reduce program expenditures, which program endangerment could start in 2020.  The combined income tax rate average for the programs would be a 33% increase or a 22% combined average benefit reduction.

The Holden PLLC (2018) online Article reported on October 30, 2018 that Social Security benefits increased 2.18% in 2019 for specific higher income brackets and basic Medicare Part B premium increased to $135.50 in 2019 for specific higher income brackets.

Robust Data Set Description

30 Variables and 69271 Rows

The final data set consists of two combined Excel Worksheets.  The selected data sets contain quantitative statistical data about the Social Security and Medicare Trust Funds’ beneficiaries and expenditures along with various U.S. Federal Government information about the population, income tax, and GDP.

  1. Worksheet: 
    2001 – 2016 SS Health Population Employment Income GDP
  2. Worksheet: 
    2016 MEDICARE HOSPITAL CLAIMS

As I was gathering the relevant statistics data sets, I found that some online data sets included seasonally adjusted data and non-seasonally adjusted data.The United States Department of Labor, Bureau of Labor Statistics, Labor Force Statistics from the Current Population Survey, What is Seasonal Adjustment (n.d.) states that seasonally adjusted data is used to compare months of statistics data and that non-seasonally adjusted data is used to calculate the estimated annual average series of statistics data.  Also, a year’s seasonal events such as “changes in weather, harvests, major holidays, and school schedules” (para. 2) causes fluctuations in the following statistics data criteria:  Labor Force Size, Employment and Unemployment Levels, and Other Levels of Labor Market Activity.  Therefore, it is a general rule that monthly reports to the news about employment and unemployment include the seasonally adjusted statistics data which the technique is used “to measure and remove the influences of predictable seasonal patterns” (para. 1) or to remove seasonal movements (month to month changes) in a series of data to “make it easier observe the cyclical, underlying trend, and other non-seasonal movements in the series” (para. 2).  For example, the Average Annual Unemployed Above Age 16 data set variable includes seasonally adjusted statistics data.

The United States Department of Labor (n.d.) describes the employment labor force as non-institutional civilians who are employed and describes the unemployed labor force as persons that don’t have a job, has looked for work in the last four weeks, or is temporarily laid off and who are available to work. The bulk of the final data set’s observations is the result of including the Hospital Compare’s data.  Medicare Hospital Spending by Claim (n.d.) provides CMS Hospital Compare data which the data is about the price-standardized and non-risk-adjusted average spending level values for Part A and Part B Medicare Spending per Beneficiary (MSPB) hospital episodes by Medicare Claim type “from 3 days prior to an inpatient hospital admission through 30 days after discharge” (para. 1).

30 ECONOMIC COMPONENT VARIABLES

  1. ANNUAL
    TIMEID
  2. AVG ANNUAL EMPLOYED 16 ABOVE(T)
    INTERVAL
    BLS, Bureau of Labor Statistics
    URL:  https://data.bls.gov
  3. AVG ANNUAL UNEMPLOYED 16 ABOVE SEASADJ(T)
    INTERVAL
    BLS, Bureau of Labor Statistics
    URL:  https://data.bls.gov
  4. AVG ANNUAL UNEMPLOYMENT RATE
    INTERVAL
    BLS, Bureau of Labor Statistics
    URL:  https://data.bls.gov
  5. AVG INCOME TAX WHOLE$
    INTERVAL
    IRS, Internal Revenue Service
    URL:  https://www.irs.gov/statistics/soi-tax-stats-historical-table-3
  6. ECONOMIC STATE STATUS(M)
    INTERVAL
    Transformed Federal_Deficit(M)
  7. FEDERAL DEFICIT(M)
    INTERVAL
    US Spending
    URL: https://www.usgovernmentdebt.us/
  8. FEDERAL DEFICIT % GDP(M)
    INTERVAL
    US Spending
    URL: https://www.usgovernmentdebt.us/
  9. Federal Final Consumption Expenditures
    INTERVAL
    Databank, Worldbank
    URL:  https://data.worldbank.org/indicator/NE.CON.GOVT.CD
  10. Gdp Us(B)
    INTERVAL
    Dababank, Worldbank
    URL:  https://data.worldbank.org/indicator/NY.GDP.MKTP.CD?end=2017&name_desc=false&start=1960&view=chart
  11. Health Care-Fed % GDP(M)
    INTERVAL
    URL:  https://www.usgovernmentdebt.us/
  12. Health Care Fed(M)
    INTERVAL
    URL:  https://www.usgovernmentdebt.us/
  13. Income Tax % AGI
    INTERVAL
    IRS, Internal Revenue Service
    URL:  https://www.irs.gov/statistics/soi-tax-stats-historical-table-3
  14. Nation Hospital Avg Spending Per Episode
    INTERVAL
    URL:  https://data.medicare.gov/Hospital-Compare/Medicare-Hospital-Spending-by-Claim/nrth-mfg3
  15. Nation Hospital Percent of Spending Per Episode
    INTERVAL
    URL:  https://data.medicare.gov/Hospital-Compare/Medicare-Hospital-Spending-by-Claim/nrth-mfg3
  16. Population 0-14
    INTERVAL
    Databank, Worldbank
    URL:  https://databank.worldbank.org
  17. Population 15-64
    INTERVAL
    Databank, Worldbank
    URL:  https://databank.worldbank.org
  18. Population 65 Above
    INTERVAL
    Databank, Worldbank
    URL:  https://databank.worldbank.org
  19. Population US
    INTERVAL
    Databank, Worldbank
    URL:  https://databank.worldbank.org
  20. SSDI Amount EndofYear(M)
    INTERVAL
    SSA, Social Security Administration
    URL:  https://www.ssa.gov/OACT/ProgData/ fyOps.html
  21. SSDI Avg Mthly Payment
    INTERVAL
    SSA, Social Security Administration
    URL:  https://www.ssa.gov/cgi-bin/awards.cgi
  22. SSDI Net Increase During Year(M)
    INTERVAL
    SSA, Social Security Administration
    URL:  https://www.ssa.gov/OACT/ProgData/ fyOps.html
  23. SSDI Total Expenditures(M)
    INTERVAL
    SSA, Social Security Administration
    URL:  https://www.ssa.gov/OACT/ProgData/fyOps.html
  24. SSDI Total Receipts(M)
    INTERVAL
    URL:  https://www.ssa.gov/OACT/ProgData/fyOps.html
  25. State Hospital Avg Spending Per Episode
    INTERVAL
    URL:  https://data.medicare.gov/Hospital-Compare/Medicare-Hospital-Spending-by-Claim/nrth-mfg3
  26. State Hospital Percent of Spending Per Episode
    INTERVAL
    URL:  https://data.medicare.gov/Hospital-Compare/Medicare-Hospital-Spending-by-Claim/nrth-mfg3
  27. Total Awarded SSDI
    INTERVAL
    SSA, Social Security Administration
    URL:  https://www.ssa.gov/cgi-bin/awards.cgi
  28. Total Employment 15 Above
    INTERVAL
    Databank, Worldbank
    URL:  https://databank.worldbank.org/data/source/jobs#
  29. Total Income Tax(Est)
    INTERVAL
    IRS, Internal Revenue Service
    URL:  https://www.irs.gov/statistics/soi-tax-stats-historical-table-3
  30. Total National Health Expenditures(M)
    INTERVAL
    CMS, Centers for Medicare and Medicaid Services
    URL:  https://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/NationalHealthExpendData/
    NationalHealthAccountsHistorical.html

Data Preparation

The data preparation process includes the following five steps:

  • First, the appropriate data variables
    were selected from various online Tables.

  • Second, the data content was saved in
    relevant Microsoft Excel Worksheets.

  • Third, some variables were renamed as appropriate names and missing values were replaced with Zeros via Microsoft Excel.

  • Fourth, the two original Worksheets were combined via the Tableau Prep 2019.1 Builder Application which I created a Tableau Prep Builder Flow to connect the data files, to clean the data files, to create unions based on the Date (Year) variable, and to run the flow to create the merged output data file.

  • Fifth, fields were removed from the original data set because the data either was out of scope for the analysis or contained unneeded Text data:
     Removed the GDP_US(M) variable and kept GDP_US(B) variable.
    Removed Population(M) variable and kept Population variable.
    Removed Hospital_Claim_Type variable.
    Removed Hospital_Name variable.
    Removed Hospital_Provider_ID variable.
    Removed Hospital_State variable.

High-Level Data Diagram

DRAW.IO
DATA SET RELATIONSHIP DIAGRAM (UNION)

Tableau Prep Builder 2019.1 Flow Procedures

Connect, Clean, Union

Missing Variable Values Replacements

The original data set’s TimeID or Date Range was 1960-2018 however I found that I needed to reduce the data set and replace missing values because the 2016 CMS Hospital Compare columns, which contains the bulk of the data set’s observations, caused data gaps before and after the year 2016.  For example, the Hospital_Avg_Spending_Per_Episode_Nation variable and the Hospital_Percent_of_Spending_Nation variable columns had data gaps or missing values (25% of the data set size) for the years 1960-2015 and a large amount of rows for the year 2016 which the remaining variables had data gaps after the first row for the year 2016.  The final data set’s TimeID range was changed from 1960-2018 to 2001-2016 to improve the data quality for the predictive model to avoid unbalanced data set issues or data gap issues caused by missing input values which I found that the Median Imputation values changed the Target variable’s Correlation values so I resolved the data gaps with Zero replacements using Microsoft Excel.  However, I kept various Tableau frequency trends visualizations from the first combined data set (1960-2018) to highlight several decades of historical data insights.

SAS® Enterprise Miner 14.3 Impute Missing Values Node

Distribution of Incomplete Observations

(Extracted from Reporter Node Output)

Variable Distribution Training Data

          Missing       Number of       Percent of

Obs    for TRAIN    Variables           Variables

 1         69290               25                    86.2069

2            67                    4                    13.7931

Zero Replacements Using Office 365 Excel

SAS® Enterprise Guide 7.1
Simple Statistics (CORR Procedure)

Part 1

Part 1

SAS® Enterprise Guide 7.1
Correlation Key Performance Indicator
(CORR Procedure Continued)

r > 0 = Strong Correlation Relationship with Target Variable

Part 2

Part 2

Office 365 Excel Worksheet Data Analysis Correlation Results with Tableau 2019.1 Visualization

r > 0 = Strong Correlation Relationship with Target Variable

New Lessons Interface or E-Book

The remaining project content is published as a CourseCraft E-course which is an user-friendy Lessons format interface and the small one time fee allows unlimited access to all Lessons (Or, request the E-Book).

The Data Analytics Research and Prediction Project Demonstration Course highlights quality work from my UMUC 2019 M.S. Data Analytics Capstone Project:  Social Security and Medicare Trust Funds Deficit Concern. (Term GPA:  “A” and Professor’s Feedback:  “Great Work”).

SIGN-UP TO SEE THE PROJECT’S RESULTS
60-DAYS UNLIMITED ACCESS!

OR, REQUEST THE E-BOOK

Request the E-Book via the Contact Menu Link or the Course

References

ASPE. Assistant Secretary for Planning and Evaluation. U.S. Department of Health and Human Services. (2005). Social Security and Medicare from a Trust Fund and budget perspective. Retrieved from, https://aspe.hhs.gov/basic-report/social-security-and-medicare-trust-fund-and-budget-perspective

Data.Medicare.gov (n.d.). Medicare hospital spending by claim. Retrieved from, https://data.medicare.gov/Hospital-Compare/Medicare-Hospital-Spending-by-Claim/nrth-mfg3

Heylighen, F. (1997, Feb. 17 and 1993, Aug.). Cybernetics. Principia Cybernetica. Retrieved from, http://pespmc1.vub.ac.be/CYBERN.html

Heylighen, F., & Joslyn, C. (1992, Nov. 1). Principia Cybernetica. What is systems theory? Retrieved from, http://pespmc1.vub.ac.be/SYSTHEOR.html

Holden PLLC, M. (2018, October 30). Social Security and Medicare changes you should know about in 2019. Retrieved from, https://www.mholdencpa.com/2018/10/30/social-security-medicare-changes-2019/

Medicare.gov.  (2019).  Medicare costs at a glance.  Retrieved from, https://www.medicare.gov/your-medicare-costs/medicare-costs-at-a-glance

Social Security.  (n.d.).  Components of Trust Fund Income and Outgo.  https://www.ssa.gov/OACT/ProgData/financialItems.html

Social Security. (n.d.). Contribution and benefit base. Retrieved from, https://www.ssa.gov/oact/COLA/cbb.html

Social Security. (n.d.). Social Security & Medicare tax data. Retrieved from, https://www.ssa.gov/oact/ProgData/taxquery.html

Social Security. (n.d.). Status of the Social Security and Medicare Programs. A summary of the 2018 Annual Reports. Social Security and Medicare Boards of Trustees. Retrieved from, https://www.ssa.gov/OACT/TRSUM/index.html

Thwink.org. (2014). What is an analytical approach? Retrieved from, http://www.thwink.org/sustain/articles/000_AnalyticalApproach/index.htm

United States Department of Labor. Bureau of Labor Statistics. Labor force statistics from the current population survey. (n.d.). Labor force characteristics. Retrieved from, https://www.bls.gov/cps/lfcharacteristics.htm#emp

United States Department of Labor. Bureau of Labor Statistics. Labor force statistics from the current population survey. (n.d.). Labor force characteristics. Retrieved from, https://www.bls.gov/cps/lfcharacteristics.htm#unemp

United States Department of Labor. Bureau of Labor Statistics. Labor force statistics from the current population survey. (n.d.). What is seasonal adjustment? Retrieved from, https://www.bls.gov/cps/seasfaq.htm

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.