Augmenting Healthcare AI: Fine-Tuning Llama 2 on COVID-19 Patient Data for Accurate Diagnosis and Treatment Recommendations
Overview
This case study examines how the Llama 2 model was fine-tuned using a dataset of COVID-19 patients to enhance its accuracy in diagnosing and recommending treatments. The project aims to develop a model that can support healthcare professionals in the efficient management of COVID-19 cases by utilizing organized medical data.
The refined model shows considerable promise in analyzing patient information, recognizing underlying health issues, and proposing suitable treatment plans customized to each patient’s needs. Additionally, this project addresses the challenges faced during the model’s development and discusses potential improvements for its practical use in real-world settings.
Primary Goals
Get the model to suggest meds based on patient lab results and medical history.
Train the model to respond to healthcare questions using a curated COVID-19 dataset.
Compare to the base Llama 2 model to measure improvement in relevance and accuracy.
Fine-Tuning Llama 2 on COVID Patient Data
Phase 1: Problem Statement and Objectives
This experiment aimed to fine-tune Llama 2 to help with healthcare tasks specific to COVID-19 patients. The goals were:
Medicine Suggestions: Get the model to suggest meds based on user-provided lab results and be contextually accurate and safe.
Knowledge Retrieval: Train the model to respond to common COVID-19 patient questions using the dataset.
Phase 2: Data Collection and Selection
The dataset for fine-tuning was from the Synthea website, comprising 16 CSV files. These files contained:
Patient Records: Demographic and health data for individual patients.
Lab Test Results: Lab data, including COVID-19 test results.
Medication Details: Prescriptions and medicine data.
Care Plans: Treatment plans and follow-ups.
Data Filtering:
Relevance Analysis: Each file was filtered to be relevant to the project goals. Files with irrelevant, duplicate, or too general data were removed.
Focused Selection: Files with critical patient data, lab results, and medication details were kept for fine-tuning.
Dataset Snapshots:
Patient Data CSV: Includes information, namely, patient ID, demographics (age, gender, race, etc.), underlying conditions, COVID-19 status, etc.
Care Plan CSV: Personalized care plans for COVID-19 patients
Medication CSV: Medications prescribed for COVID-19 patients
Phase 3: Data Preprocessing
1. Column Selection
Only the columns relevant to the project goals were kept from the filtered CSV files.
Patient Identifiers: Patient ID and demographics for context.
Lab Test Results: Diagnostic indicators and test results.
Conditions and Diagnoses: Patient conditions and medical diagnoses.
Prescribed Medications: Medications, dosages, and administration schedule.
Care Plans: Care plan descriptions and goals.
We wanted to focus on data, allowing the model to suggest accurate patient-specific recommendations and respond to questions.
2. Data Conversion
The filtered data was converted to a JSON-like format for fine-tuning the Llama 2 model. This conversion was to match the model’s input format and train the model.
Key Modifications:
Key-Value Pair: Patient attributes (e.g., ID, demographics, test results) were converted to key-value pairs for readability and simplicity.
Structured Prompts: Data was re-formatted to include structured prompts that guide the model’s response. For example, prompts explicitly state whether the context is for medicine suggestions, question answering, or lab report interpretation.
Instructional Data: Additional prompt instructions were added to help the model understand the tasks. For example:
Questions about symptoms, meds, or conditions were framed as questions.
Care plan descriptions were added to guide the model on follow-up and management suggestions.
3. Final Dataset Format
The processed dataset was formatted in a JSON-like format for Llama 2 model input. Here is an example of the data:
Data Format Breakdown:
Patient Information: Patient IDs, demographics, and health coverage.
Care Plans: Care plan descriptions and reasons (e.g., diabetes management, dementia management).
Conditions: Diagnosed conditions and medical history (e.g., cardiac arrest, Alzheimer’s, anemia).
Medications: Prescribed medications and administration (e.g., Memantine hydrochloride).
Refined Structure: A JSON-like format makes pairing patient information with prompts for model fine-tuning effortless.
Clearer Instructions: Including explicit prompts will help the model generate more accurate and task-specific answers.
Data Consistency: Key-value pairs will standardize the input format and reduce training ambiguity.
This preprocessing pipeline got the data clean, structured, and Llama 2 compatible.
Phase 4: Model Fine-Tuning
1. Hardware Utilization
Kaggle’s free-tier GPU was used to fine-tune, ensuring cost-effectiveness. Configuration:
GPU: NVIDIA Tesla P100
RAM: 15 GB
This way, fine-tuning was done without incurring extra costs and using high-performance computing resources.
2. Dataset Size
Approx 700-800 patient records were used for fine-tuning. These records included structured patient data like demographics, lab test results, medications, care plans, and conditions. Curated and preprocessed datasets ensured the model could learn from relevant and meaningful inputs.
3. Fine-Tuning time
Fine-tuning took around 6-7 hours. The model’s parameters were adjusted to the specific healthcare dataset during this time. The training was monitored to ensure stable learning without overfitting.
4. Fine-Tuning Process
Goal: Enable the Llama 2 model to:
Suggest medicines based on patient lab reports and care plans.
Answer COVID-19-related questions accurately and contextually.
Training Approach:
Supervised learning with a curated JSON-like dataset.
Prompts were designed to mimic real-world scenarios:
“Recommend medicines for this patient based on lab results.”
“Explain care plan for diabetes and Alzheimer’s.”
The model learned to generate responses in the context of healthcare scenarios.
5. Fine-Tuning Process Snapshot
Here’s a high-level example of the training workflow and sample input-output pairs used during fine-tuning:
Example Training Input:
Phase 5: Model Inference and Testing
After fine-tuning, the model was deployed for inference to test its ability to generate accurate and contextually relevant responses based on the trained dataset. Testing was divided into 2 parts: testing on real-world healthcare scenarios and testing for data memorization.
1. Inference
The fine-tuned Llama 2 model was tested during inference with user input like lab test reports, patient symptoms, and medical history. The model was asked to generate:
Short Summaries: Summarize the patient’s medical condition based on test reports and symptoms.
Medication Suggestions: Medicines, treatments, and care plans from the COVID-19 dataset.
The goal was to see how well the model applied the training to generate accurate, relevant, and actionable insights.
2. Memorization Testing
Specific patient details from the training data were asked to test the model’s ability to memorize. While the model retrieved patient information in the same structured format as the training data, some errors were observed:
Some fields were misreported or missing.
Some responses had repeated or irrelevant medication suggestions.
This shows that more fine-tuning or retrieval-based techniques are needed for highly accurate responses when dealing with memorized data.
3. Example: Inference
Question:
Summarize the patient’s report based on our COVID-19 records and suggest recommended medications to treat disease from the COVID-19 dataset.
Patient Report:
Patient ID: d4g67891-e21f-45a3-jk5lm-789001d60793
Gender: F
Birthplace: Revere, Massachusetts
Medical History: Type 2 Diabetes, Hypertension, Seasonal Influenza (last year)
Current Symptoms: High Fever, Dry Cough, Fatigue, Shortness of Breath, Headache
0.25 ML Enoxaparin Sodium 100 MG/ML Prefilled Syringe
4. Testing observations
Strengths:
The model generated accurate diagnoses based on test results and symptoms.
Medications were relevant and in line with the training data for COVID-19 and related conditions (e.g., Acetaminophen for fever, Enoxaparin Sodium for anticoagulation therapy in hypoxia or pneumonia).
Weaknesses:
Repetition: The model repeated some medication suggestions unnecessarily. This might be due to the way the output is formatted.
Memorization: While the model could recall data in a structured format, some fields were misreported or missing in some tests.
Comparing Fine-Tuned Model Results With Base Model
We compared the fine-tuned Llama 2 model with the base model on the COVID-19 dataset to see how well the fine-tuned model absorbed domain knowledge from the dataset compared to the base model’s general knowledge.
Here’s an example:
Example Input
Question: Summarize the patient’s report based on our COVID-19 records and suggest recommended medications to treat disease from the COVID-19 dataset.
Patient Report:
Patient ID: a1e05955-b51f-36a4-ganed-798928c20840
Gender: F
Birthplace: Boston, Massachusetts, US
Medical History: Suspected COVID-19, Hypertension, Diabetes
Oseltamivir (Tamiflu) 75mg PO twice a day for 10 days.
Azithromycin 250mg PO once a day for 5 days.
Check blood glucose and adjust as needed.
Check blood pressure and adjust as needed.
Additional Remarks: Given the patient’s history of suspected COVID-19, hypertension and diabetes, and current symptoms of cough, dyspnea, and wheezing, it is likely the patient has developed pneumonia or ARDS due to COVID-19.
The model provides an additional explanation for its medication recommendations:
Oseltamivir (Tamiflu): Effective against influenza A and B, reduces severity and duration of symptoms.
Azithromycin: Treats bacterial infections, which are common secondary infections in COVID-19 patients.
Limitations:
The suggested medications (e.g., Oseltamivir) are more for influenza than for COVID-19.
The base model doesn’t consider COVID-19-specific medications such as anticoagulants or oxygen therapy.
No clear differentiation between influenza, bacterial infections, and COVID-19 conditions.
Fine-Tuned Model (With COVID-19 Dataset):
Diagnosis:
COVID-19 (Probable)
Hypertension
Diabetes (Uncontrolled)
Recommended Medications:
Azithromycin 500mg PO once a day for 5 days.
Levofloxacin 500mg PO once a day for 5 days.
Omeprazole 40mg PO once daily for 5 days (for potential GI side effects from meds).
Acetaminophen 1000mg PO every 4 hours as needed (for fever and pain).
Ibuprofen 400mg PO every 6 hours as required (alternative for inflammation).
Aspirin 81mg PO thrice daily as needed (for antiplatelet therapy).
Lisinopril 5mg PO once a day (for hypertension).
Metformin 500mg PO twice a day (for diabetes).
Levothyroxine 100mcg PO once a day (if thyroid function is impaired).
Albuterol: 2 puffs PO every 4 to 6 hours as needed (for wheezing and dyspnea).
Explanation:
The fine-tuned model concludes the patient’s symptoms and test results confirm COVID-19 diagnosis with suspected hypertension and uncontrolled diabetes, increasing the risk of severe outcomes. Based on fine-tuned knowledge:
Azithromycin and Levofloxacin for secondary bacterial infections and respiratory support.
Antipyretics (Acetaminophen and Ibuprofen) for symptom management.
Aspirin reduces the risk of clotting disorder,s which are common in severe COVID-19.
Medications for pre-existing conditions (Lisinopril and Metformin) to manage hypertension and diabetes.
Improvements over the base model:
The fine-tuned model better understands COVID-19-specific complications and medications.
Multi-dimensional recommendations for COVID-19 and pre-existing conditions.
The medications list is more specific and realistic using the COVID-19 dataset.
Comparison Highlights
Base Model:
Relying on general medical knowledge.
Oseltamivir (Tamiflu) was more for influenza than COVID-19.
No specificity in managing COVID-19 symptoms and risks.
Fine-tuned model:
Accurate and specific recommendations for COVID-19.
Covered primary symptoms and complications (e.g. hypertension, diabetes, bacterial infections).
Shows a deeper understanding of the healthcare dataset, medications are from the training data.
Summary
The comparison shows the benefits of fine-tuning Llama 2 on a domain-specific dataset. The base model relied on general medical knowledge, the fine-tuned model has more contextual understanding and can generate more accurate and actionable responses for COVID-19 scenarios. Fine-tuning large language models for specialized tasks in healthcare makes sense.
2nd Attempt: Comparison of Fine-Tuned Model Results With Base Model
As part of the evaluation, the fine-tuned model was compared to the base (original) model to see if it could generate more accurate and relevant responses. Below is the comparison using a patient report as input.
Example Input
Question:
Summarize the patient’s report based on our COVID-19 records and suggest recommended medications to treat disease from the COVID-19 dataset.
Patient Report:
Patient ID: b2f06788-c62e-47b9-abc3-1234567890ab
Gender: Male
Birthplace: Los Angeles, California, US
Medical History: Suspected COVID-19, Asthma, Hyperlipidemia
Current Symptoms: Fever, Cough, Shortness of Breath
Based on the patient’s history and symptoms, they likely have COVID-19.
Recommended Medications:
Oseltamivir (Tamiflu): Antiviral to reduce the severity of COVID-19 symptoms.
Zanamivir (Relenza): Antiviral for COVID-19 symptoms.
Bronchodilators: Albuterol or ipratropium bromide for asthma symptoms.
Statins: Atorvastatin or simvastatin for hyperlipidemia and cardiovascular risk.
Additional Notes:
The base model provided a high-level summary and acknowledged the patient’s risk factors of asthma and hyperlipidemia. However, the recommendations are too general and not specific to COVID-19 treatment protocols:
Oseltamivir and Zanamivir are antivirals for influenza, not COVID-19, and not relevant here.
Bronchodilators and statins are for asthma and hyperlipidemia, but no COVID-19-specific treatments like corticosteroids or antibiotics for secondary infections
Fine-Tuned Model (With COVID-19 Dataset):
Diagnosis:
COVID-19 (RT-PCR Confirmed).
Asthma (potential exacerbation).
Hyperlipidemia.
Recommended Medications:
Azithromycin 250 mg PO daily for 5 days: For secondary bacterial infections.
Prednisone 10 mg PO daily for 5 days: Corticosteroid for inflammation and asthma exacerbation.
Albuterol 10 mg via inhaler daily for 5 days: Bronchodilator for asthma symptoms and shortness of breath.
Omeprazole 40 mg PO daily for 5 days: For gastrointestinal side effects of corticosteroids.
Amlodipine 20 mg PO daily for 5 days: For hypertension risk.
Acetaminophen 1000 mg PO as needed for fever.
Paracetamol 500 mg PO as needed for fever and pain.
Comparison Highlights
Fine-Tuned Model Strengths:
Contextual Recommendations:
A fine-tuned model recommends azithromycin (antibiotic) for secondary bacterial infections seen in COVID-19 patients with asthma.
Prednisone for inflammation which is critical for asthma exacerbation in COVID-19.
Antipyretics (Acetaminophen and Paracetamol) for fever and pain. It shows an understanding of symptom management.
Holistic:
The model has medication for primary (COVID-19) and secondary conditions (asthma exacerbation, hyperlipidemia). It shows an understanding of the patient’s overall needs.
Specificity in Dosage:
The fine-tuned model has detailed dosages and duration for all medications.
Fine-Tuned Model Limitations:
Some medications are repeated (e.g. Prednisone, Albuterol, Omeprazole, and Acetaminophen are listed multiple times). Need to refine post-processing logic to remove duplicates.
Base Model Limitations:
Generic Responses:
The base model recommends antivirals like Oseltamivir and Zanamivir which are not the first line of treatment for COVID-19. These are general medical knowledge and not domain-specific.
No COVID-19 Specificity:
The base model does not prescribe corticosteroids (Prednisone) or antibiotics (Azithromycin) for COVID-19 complications.
Missing Symptom Management:
The base model does not recommend antipyretics or bronchodilators in sufficient detail.
Summary
A fine-tuned model is better than a base model because it offers:
COVID-19 Specific treatments: Recommendations for primary (COVID-19) and secondary conditions (asthma exacerbation, hyperlipidemia).
Specific Dosage and Duration: More specific treatment plans.
Full Symptom Management: Medications for fever, inflammation, and respiratory distress.
While a fine-tuned model is better, more work is needed to remove duplicates and refine the responses. This comparison shows the value of fine-tuning a general model for a specific domain like healthcare.
3rd Attempt: Fine-Tuned Model vs Base Model
Here we test the base model (unfine-tuned) and fine-tuned model on a new patient case. We want to see if both models can recommend medications and provide a summary of the specific conditions in the patient’s report.
Example Input
Question:
Summarize the patient’s report based on our COVID-19 records and suggest recommended medications to treat disease from the COVID-19 dataset.
Patient Report
Patient ID: c3d07899-d73f-48c0-bcd4-9876543210cd
Gender: Female
Birthplace: Miami, Florida, US
Medical History: Suspected COVID-19, Chronic Kidney Disease (Stage 3), Obesity
Current Symptoms: Sore throat, Fatigue, Body aches
Acetaminophen (Tylenol): For body aches and fever.
Ibuprofen (Advil, Motrin): For fever and body aches.
Azithromycin (Zithromax): Antibiotic for bacterial infections.
Hydroxychloroquine (Plaquenil): For fatigue and joint pain.
Prednisone: For inflammation.
Additional Notes:
Base model gave several recommendations but many were too general and included medications with limited or no evidence for COVID-19 such as Hydroxychloroquine. The response did not take into account the patient’s specific conditions (e.g. chronic kidney disease and obesity) and did not follow established COVID-19 treatment guidelines.
Limitations:
Oseltamivir and Hydroxychloroquine are not recommended for COVID-19.
No mention of the patient’s chronic kidney disease (CKD) or adjustment of medications for CKD.
Medications that are contraindicated or require dose adjustment for CKD (e.g. Ibuprofen).
Fine-Tuned Model (With COVID-19 Dataset):
Diagnosis:
COVID-19 (RT-PCR positive).
Chronic Kidney Disease (Stage 3).
Obesity.
Recommended Medications:
Prednisone 50 mg PO daily: To reduce inflammation and manage symptoms like fatigue and sore throat.
Acetaminophen 500 mg PO as needed: For body aches and mild fever.
Albuterol 10 mg via inhaler as needed: For shortness of breath due to obesity and COVID-19.
Omeprazole 50 mg PO daily: To prevent GI side effects of prednisone.
Lisinopril 50 mg PO daily: For cardiovascular risk of obesity and CKD.
Tylenol 500 mg PO as needed: Alternative to acetaminophen for pain.
Additional Notes:
The fine-tuned model gave medications for COVID-19 symptoms while considering the patient’s CKD and obesity.
Antipyretics (e.g. Acetaminophen) and anti-inflammatory medications (e.g. Prednisone) were chosen to manage symptoms without worsening CKD.
Recommendation to avoid NSAIDs like Ibuprofen shows consideration of CKD as these medications can worsen kidney function.
Comparison Highlights
Fine-Tuned Model Strengths:
Contextual Recommendations:
Medications were tailored to the patient’s specific conditions (e.g. CKD and obesity) in addition to COVID-19 symptoms.
Avoidance of harmful medications (e.g. NSAIDs) shows a better understanding of CKD risks.
COVID-19 Protocols:
The fine-tuned model did not include controversial or less effective medications like Oseltamivir or Hydroxychloroquine.
Medications like Prednisone and Acetaminophen are used for COVID-19 and its symptoms.
Gastrointestinal Protection:
Omeprazole to prevent GI side effects of corticosteroids (Prednisone) shows understanding of medication interactions.
Comorbidities:
Medications for obesity and CKD (e.g. Lisinopril) show a whole patient approach.
Base Model Weaknesses:
Broad Responses:
The base model included antivirals like Oseltamivir which are not the first line for COVID-19.
Hydroxychloroquine is outdated and no longer recommended for COVID-19.
No Comorbidity Consideration:
The base model did not consider the patient’s CKD or obesity when recommending medications. For example, NSAIDs like Ibuprofen can worsen CKD but the base model didn’t account for that.
Medication Safety:
The base model didn’t consider contraindications or dose adjustment for CKD.
COVID-19 Relevance:
Hydroxychloroquine and Tamiflu show a lack of knowledge of COVID-19 treatment protocols.
Summary
The fine-tuned model is better than the base model in the:
Accuracy: Contextual recommendations for COVID-19, CKD, and obesity.
Safety: Avoidance of medications contraindicated for CKD and inclusion of protective measures (e.g. Omeprazole).
Clinical Relevance: Alignment with current COVID-19 treatment protocols, no outdated or controversial therapies.
While the fine-tuned model still has some redundancy (e.g. repeated Acetaminophen and Tylenol) it’s more reliable and context-aware than the base model. More refinement (e.g. output clarity and no repetition) would make it usable in the real world.
4th Attempt: Fine-Tuned Model Vs Base Model
This example compares the fine-tuned Llama 2 model with the base model when asked to summarize a patient’s report and recommend treatment based on COVID-19 records and comorbidities.
Example Input
Question:
Summarize the patient’s report based on our COVID-19 records and suggest recommended medications to treat disease from the COVID-19 dataset.
Patient Report Summary:
Patient ID: d4e08910-e84f-49c1-bcde-1234567890ef
Gender: Male
Birthplace: Seattle, Washington, US
Medical History: Suspected COVID-19, COPD, Heart Disease
Current Symptoms: High Fever, Persistent Cough, Chest Tightness
The patient seems to have a severe case of COVID-19, complicated by pneumonia or acute respiratory distress syndrome (ARDS). Additionally, pre-existing conditions, including COPD and heart disease, likely worsen their symptoms.
Recommended Medications:
Oseltamivir (Tamiflu): Antiviral to reduce symptoms.
Zanamivir (Relenza): Another antiviral to alleviate COVID-19 symptoms.
Bronchodilators: For COPD symptoms.
Diuretics: To manage fluid buildup that causes heart failure.
Limitations:
The base model included generic antivirals (e.g. Oseltamivir and Zanamivir) which are not effective for COVID-19, and show a lack of knowledge on COVID-19 treatment protocols.
No antibiotics, corticosteroids, or oxygen therapy are recommended for the patient’s suspected conditions (e.g. COPD exacerbation and heart disease).
No mention of the patient’s risk of severe complications due to COPD and heart disease.
Fine-Tuned Model (With COVID-19 Dataset):
Summary:
The patient has COVID-19 and is having complications, COPD exacerbation, and symptoms of heart disease. His high fever and persistent cough may be pneumonia or another severe respiratory infection. Treatment is needed for both COVID-19 and underlying conditions.
Recommended Medications:
Azithromycin 250 mg PO once daily for 10 days: For secondary bacterial infections.
Levofloxacin 500 mg PO once daily for 10 days: Broad spectrum antibiotic for respiratory infections.
Oxygen Therapy: Via nasal cannula 2 liters/minute to improve oxygen saturation.
Albuterol 2 mg via nebulizer once daily for 10 days: To relieve bronchoconstriction and support respiratory function.
Ipratropium 0.5 mg via nebulizer once daily for 10 days: To support respiratory function in COPD.
Prednisone 50 mg PO once daily for 5 days: Corticosteroid to reduce inflammation and COPD exacerbation.
Atropine 0.4 mg IM once daily for 3 days: For severe respiratory or cardiac distress.
Comparison Highlights
Fine-Tuned Model Strengths:
Specific to COVID-19 and Comorbidities:
The fine-tuned model includes COVID-19 and COPD exacerbation and heart disease-specific treatments.
Antibiotics like Azithromycin and Levofloxacin are recommended for secondary bacterial infections seen in severe COVID-19 cases.
Respiratory Management:
Albuterol and Ipratropium nebulizers are included for COPD management.
Oxygen therapy is used to address respiratory distress.
Evidence-Based Treatments:
Prednisone is used to reduce inflammation, as per the standard of care for severe COVID-19 and COPD exacerbation.
No outdated or ineffective treatments (e.g. Oseltamivir, Zanamivir) are included, indicating improved context understanding from fine-tuning.
Whole Patient Approach:
The fine-tuned model considers immediate symptoms (e.g. fever, cough, chest tightness) and underlying conditions (e.g. COPD, heart disease).
Fine-Tuned Model Weaknesses:
Duplicate Medications:
Prednisone and Albuterol are duplicated in the response. Post-processing needs to be improved to remove duplicates.
Too Broad:
Atropine is not always necessary. Fine-tuning can remove less relevant treatments.
Base Model Weaknesses:
Generic and Wrong:
The base model suggests antivirals Oseltamivir and Zanamivir which are not for COVID-19.
Does not include oxygen therapy, antibiotics, or corticosteroids for the patient’s specific conditions.
Comorbidity Ignorance:
The base model does not account for the patient’s underlying conditions (e.g. COPD and heart disease) when suggesting medications.
Diuretics are too general and not applicable without more information on the patient’s cardiac status.
General Observations About Results
Overall Notes
Fine-Tuned Model Performance:
This model showcases significant improvements in providing domain-specific and evidence-based recommendations for patients with comorbidities.
Includes treatments from fine-tuned datasets (antibiotics, nebulizers, oxygen therapy) and aligns with current COVID-19 treatment guidelines.
To Improve:
The model sometimes repeats medication names or includes irrelevant options (e.g. Atropine). Optimization in response generation is needed.
Post-processing to remove duplicates.
Next:
Fine-tune to fix the root causes of errors and reduce duplication.
Continuously evaluate and refine for real-world use.
Summary
The fine-tuned model is better than the base model in the:
Relevance: Aligns with COVID-19 treatment guidelines.
Safety: Treatments are for comorbidities like COPD and heart disease.
Holistic: The model includes a plan for both disease and underlying conditions.
More work is needed, but a fine-tuned model is deployable with further refinement. To get professional AI consulting services for your project, connect with our professional AI consultants, who will understand your requirements, challenges, and provide the right solution to your project requirements.
Phase 6: Hurdles and Future Endeavours
Challenges
Inaccurate Data Generation:
A fine-tuned model is better than a base model but sometimes generates inaccurate or irrelevant output.
Problems:
Duplicate Medications: Some output contains repeated medication names making the response less concise and harder to read.
Irrelevant Treatments: The model sometimes suggests medications that are not necessary or applicable to the patient’s condition (e.g. Atropine or certain NSAIDs for patients with chronic kidney disease).
Memorization Error: During testing, the model sometimes retrieves wrong patient data and produces output that is not consistent with the input.
Comorbidities:
The model is good at identifying comorbidities like COPD and heart disease but sometimes misses how these comorbidities interact with COVID-19.
The model does not prioritize treatments that account for contraindications (e.g. avoid NSAIDs in chronic kidney disease).
Dataset Limitations:
The fine-tuned dataset is comprehensive but may not cover all edge cases or rare conditions so there are gaps in the model.
This is evident when dealing with complex cases with multiple comorbidities.
Output Formatting:
Output is not consistent in formatting and phrasing so cannot be used as is for practical purposes.
Responses are not structured and some are redundant or too verbose.
Future Endeavours
Improve Model Accuracy:
More focus on fixing medication recommendations by:
Adding Edge Cases: Include rare or complex medical scenarios in the dataset for fine-tuning.
Fine-Tune Iterations: Fine-tune with more curated and preprocessed data to refine the model’s context.
Error Analysis: Analyze errors in the output to find patterns of inaccuracies.
Remove Redundancy in Output:
Post-processing logic will:
Remove duplicate medication suggestions.
Make output concise and actionable.
Real-World Feedback:
Work with medical professionals to test the model in a controlled real-world setting and gather feedback.
Use feedback to fine-tune and improve response quality and relevance.
Comorbidities
Additional datasets for specific comorbid conditions (e.g. diabetes, CKD, asthma) will be added to the model to handle complex cases.
Will focus on contraindications and individual patient needs.
Output Formatting:
Create a template for all output to be uniform and readable. For example:
Diagnosis: Separate from treatment recommendations.
Medications: List with exact dosage, duration, and purpose.
Additional Notes: Brief justification for recommended treatment.
Retrieval-Augmented Techniques
Add a retrieval mechanism so the model can reference an external knowledge base for more accurate and up-to-date recommendations.
This will combine the fine-tuned model’s context with external resources.
Evaluation Metrics:
Create metrics to evaluate the model performance on:
Relevance: Are the suggestions relevant to the input?
Accuracy: Are the treatments in line with current medical guidelines?
Clarity: Is the output concise and not redundant?
Scalability:
Can scale the model to a bigger and more diverse dataset with global medical guidelines and region-specific COVID-19 protocols.
Can be used by more users and healthcare systems.
Summary
Despite the issues, the fine-tuned Llama 2 model holds a lot of potential to revolutionize healthcare by generating personalized patient treatment recommendations. It can be forged into a highly credible and useful healthcare tool by overcoming its current limitations and making iterative enhancements to the model.
With more fine-tuning, better error analysis, and collaboration with medical professionals, the model’s reliability, relevance, and usability can be enhanced significantly, thus allowing the model to be broadly adopted in live clinical environments. Get professional AI solutions for healthcare challenges and revolutionize how you manage patient data, treatments, and everything.
Need Expert AI Fine-Tuning?
Discuss your project requirements with our experts.