Fine-Tuning LLaMA 3.2 for Positive Conversations: Should 'Bad' Examples Be Included in Training Data?" #146791
Replies: 2 comments
-
Quick Take: Detailed Explanation:
Some toughts: |
Beta Was this translation helpful? Give feedback.
-
Hi @Manojkiran-G, Thanks for being a part of the GitHub Community, we're glad you're here! If you're looking for help for this specific topic, you might want to try asking for help somewhere that focuses on this project. It's possible that another GitHub user might have run into this same issue and can help, but the GitHub Community Discussions focuses primarily on topics related to GitHub itself or collaboration on project development and ideas. We want to make sure you’re getting the best support you can, but this space may not be the right place for this particular topic. Best of luck! |
Beta Was this translation helpful? Give feedback.
-
Hey guys , I'm currently working on fine-tuning llama 3.2 model for a use case involving various conversations. These conversations include both "good" (positive, respectful, and engaging) and "bad" (negative, disrespectful, or inappropriate) examples, and my goal is to train the model to maintain a positive tone and avoid generating harmful or inappropriate responses.
However, I’m unsure whether I should include the "bad" conversations in the training data. On one hand, including them might help the model learn to identify what makes a conversation go "wrong" and recognize patterns associated with negative tone, which could help it avoid making similar mistakes. On the other hand, I worry that including these "bad" conversations could lead the model to pick up undesirable patterns or behaviors, potentially causing it to generate responses with a negative tone, or even diluting the focus on positive behavior during training.
I’m curious if anyone here has worked on a similar challenge or has any advice on how to best handle this. Should I exclude the "bad" conversations entirely and focus only on good examples, or is it beneficial to incorporate them for the purpose of learning from both sides of the conversation? Would love to hear your thoughts!
Beta Was this translation helpful? Give feedback.
All reactions