Computational Biology Seminar
BIOSC 1630
Fall 2024 • University of Pittsburgh • Department of Biological Sciences
Topics in computational biology will be explored using primary literature. Students will present research articles orally and complete a series of writing assignments that will culminate in producing a literature review paper.
Overview¶
Computational biology as a field moves extremely fast and is communicated almost exclusively through scientific literature. Most courses in the computational biology degree teach you computer science or biology outside the context of the field. This course, at least my version of it, gives you time and space to build your computational biology knowledge by routinely reading primary research.
Each week, the instructor will assign a scientific article from one of various computational biology subfields for students to read. Early in the semester, our focus will be on gaining experience digesting and understanding the article. As the semester progresses, we will practice critiquing the articles so that you are prepared and confident in your understanding of the material.
License¶
Code contained in this project is released under the GPLv3 license as specified in LICENSE.md.
All other data, information, documentation, and associated content provided within this project are released under the CC BY-NC-SA 4.0 license as specified in LICENSE_INFO.md.
Why are we using copyleft licenses? Many people have dedicated countless hours to producing high-quality materials incorporated into this website. We want to ensure that students maintain access to these materials.
Web analytics¶
Why would we want to track website traffic?
An instructor can gain insights into how students engage with online teaching materials by analyzing web analytics. This information is instrumental in assessing the effectiveness of the materials. Web analytics reveal the popularity of specific topics or sections among students and empower instructors to tailor future lectures or discussions. Analytics also provides valuable data for curriculum development, helping instructors identify trends, strengths, and weaknesses in course materials. Additionally, instructors may leverage web analytics as evidence of their commitment to continuous improvement in teaching methods, which is helpful in discussions related to professional development, promotions, or tenure.
We track website traffic using Plausible, which is privacy-friendly, uses no cookies, and is compliant with GDPR, CCPA, and PECR.
Syllabus ↵
Syllabus¶
Semester: Fall 2024
Meeting time: Wednesdays from 1:00 - 3:30 pm.
Location: 302 Cathedral of Learning
Instructor: Alex Maldonado, PhD (he/him/his)
Email: alex.maldonado@pitt.edu
Office hours: No routine office hours will be provided.
Meetings can be scheduled on a case-by-case basis.
Catalog description¶
Topics in computational biology will be explored using primary literature. Students will present research articles orally, as well as complete a series of writing assignments that will culminate in the production of a literature review paper.
Prerequisites¶
You must be a Junior or Senior on the CBUAS-BS and CBUSCI-BS plan. You must also have completed BIOSC 1540 with a minimum grade of C and one of the following courses:
- ENG 0102;
- ENGCMP (0002 or 0006 or 0020 or 0200 or 0203 or 0205 or 0207 or 0208 or 0210 or 0212);
- ENGFLM 0210;
- FP (0003 or 0006).
Writing intensive¶
BIOSC 1630 will fulfill the requirement for a writing-intensive course in the Computational Biology major. The Dietrich School of Arts & Sciences Writing Institute defines this as a course in which students
- engage with writing substantively throughout the term;
- write and revise throughout the term (not just at the end);
- write a total of 5750 to 6250 words;
- get feedback from their teacher and their peers.
Note
Five hundred words is approximately one page single-spaced and two pages double-spaced. You are expected to write 11 to 13 single-spaced pages or 23 to 25 double-spaced pages in this course.
Course philosophy¶
Clear and efficient communication is a crucial—and often neglected—aspect of science. Intentional or unintentional miscommunication of research and data can result in adverse consequences; some examples include a flawed Challenger launch decision, inconsistent usage of units with the Mars Climate Orbiter, poor communication between experts and scientists with the public during the COVID-19 pandemic, and rampant misinformation. Lectures, assignments, and activities for this course are designed to teach you the tools to communicate effectively and comprehend scientific literature in computational biology.
What makes good written or visual communication? The answer is highly subjective. I argue that there is no "correct way" to communicate, and it depends on both the material's producer and consumer. Some things could make papers or presentations incomprehensible, but everything else that turns "acceptable" into "excellent" is a matter of taste.
Furthermore, it's essential to recognize that computational biology is a rapidly evolving landscape. To keep up-to-date, we must ensure our working knowledge is broad enough to comprehend and incorporate advancements quickly. We'll look at many different topics in computational biology, exploring its complex details and the changes it leads to.
My principal goals for this course are to equip you with the tools to
- Navigate and understand the various subfields of computational biology;
- Organize, draft, revise, and finish preparing writings and presentations;
- Recognize what aspects hamper written (e.g., grammar, organization, formatting) and visual (e.g., color, design, pace) communication;
- Discover your voice and style of communication;
- Process and digest information from a variety of different scientific sources.
Outcomes¶
After successfully completing this course, students should be able to do the following.
- Efficiently search for and identify relevant scientific literature.
- Effectively summarize scientific literature's motivation, methods, and critical findings.
- Critically evaluate the robustness and validity of methods and analyses.
- Interpret and draw meaningful conclusions from computational data.
- Assess the transparency, reproducibility, and adherence to open science practices.
- Understand the interplay between computational and experimental corroboration.
- Communicate scientific ideas and data effectively through clear, concise, and well-structured writing.
- Deliver engaging and informative presentations on scientific literature.
Schedule¶
DRAFT
This page is a work in progress and is subject to change at any moment.
Week 1¶
Wednesday (Aug 28) Lecture 01
Week 2¶
Wednesday (Sep 4) Lecture 02
Week 3¶
Wednesday (Sep 11) Lecture 03
Week 4¶
Wednesday (Sep 18) Lecture 04
Week 5¶
Wednesday (Sep 25) Lecture 05
Week 6¶
Wednesday (Oct 2) Lecture 06
Week 7¶
Wednesday (Oct 9) Lecture 07
Week 8¶
Wednesday (Oct 16) Lecture 08
Week 9¶
Wednesday (Oct 23) Lecture 09
Week 10¶
Wednesday (Oct 30) Lecture 10
Week 11¶
Wednesday (Nov 6) Lecture 11
Week 12¶
Wednesday (Nov 13) Lecture 12
Week 13¶
Wednesday (Nov 20) Lecture 13
Thanksgiving break¶
No class on Nov 27.
Week 15¶
Wednesday (Dec 4) Lecture 14
Finals¶
No final exam will be administered for this course.
Assessments¶
DRAFT
This page is a work in progress and is subject to change at any moment.
Distribution¶
The course will have the following point distribution.
- Paper
  - Theme analysis: 6%
  - Literature review: 8%
  - Introduction: 6%
  - Field overview: 12%
  - Analysis: 12%
  - Draft: 8%
  - Final draft: 18%
- Pre-class assignments: 10%
- Activities: 20%
Late Assignments and Extensions¶
I am mindful of the diverse nature of deadlines, particularly in the scientific realm. Some are set in stone, while others exhibit more flexibility. Notably, the scientific community frequently submits manuscripts and reviews days, weeks, or months after the editor's request. Such practices are widely understood. Conversely, submitting a grant application even a minute past the deadline makes it ineligible for review.
I will use the following late assignment and extension policy. It encourages timely submissions while acknowledging the influence of external commitments and unforeseen circumstances.
- Each assignment has a specified due date and time.
- Assignments submitted after the due date will incur a late penalty.
- The late penalty is calculated using the function % Penalty = \(0.01 \times (1.4 \times \text{hours late})^2\), rounded to the nearest tenth. This results in approximately:

Hours late | Penalty |
---|---|
6 | 0.7% |
12 | 2.8% |
24 | 11.3% |
48 | 45.2% |
72 | 100.0% |

- I will not accept assignments 72 hours (3 days) after the due date.
- The penalty is applied to the assignment's total possible points. For example, if an assignment is worth 100 points and is submitted 36 hours late, the penalty would be approximately 25.4 points.
- To reward punctuality, each assignment submitted on time will earn a 2% bonus on that assignment's score.
- These on-time bonuses will accumulate throughout the semester and will be added to your final course grade.
Exceptions to this policy could be made on a case-by-case basis for extenuating circumstances. Please communicate with me as early as possible if you anticipate difficulties meeting a deadline.
By submitting assignments on time, you can earn up to 2% extra credit on your final grade, depending on the number and weight of assignments in the course. If you are close, this bonus can bump your grade to the next highest letter grade. Consequently, I will not round up any final grades.
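As a rough check of the late penalty arithmetic above, the following minimal sketch evaluates the stated function for a few submission delays. The function name `late_penalty_percent` is hypothetical, and the cap at 100% is an assumption inferred from the 72-hour row of the penalty table (the uncapped formula gives about 101.6% at 72 hours).

```python
def late_penalty_percent(hours_late: float) -> float:
    """Return the late penalty, in percent of total possible points."""
    if hours_late <= 0:
        return 0.0  # on-time submissions incur no penalty
    raw = 0.01 * (1.4 * hours_late) ** 2
    # Assumption: the penalty is capped at 100%, matching the table's 72-hour row.
    return round(min(raw, 100.0), 1)


# Print the penalty for the delays listed in the syllabus, plus the 36-hour example.
for hours in (6, 12, 24, 36, 48, 72):
    print(f"{hours:>2} hours late: {late_penalty_percent(hours)}%")
```

Because the penalty grows with the square of the delay, submitting a few hours late costs little, while each additional day costs far more than the one before.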
Scale¶
I will assign letter grades for this course based on Pitt's recommended scale (shown below).
Letter grade | Percentage | GPA |
---|---|---|
A + | 97.0 - 100.0% | 4.00 |
A | 93.0 - 96.9% | 4.00 |
A – | 90.0 - 92.9% | 3.75 |
B + | 87.0 - 89.9% | 3.25 |
B | 83.0 - 86.9% | 3.00 |
B – | 80.0 - 82.9% | 2.75 |
C + | 77.0 - 79.9% | 2.25 |
C | 73.0 - 76.9% | 2.00 |
C – | 70.0 - 72.9% | 1.75 |
D + | 67.0 - 69.9% | 1.25 |
D | 63.0 - 66.9% | 1.00 |
D – | 60.0 - 62.9% | 0.75 |
F | 0.0 - 59.9% | 0.00 |
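The scale above can be read as a simple threshold lookup. The sketch below is illustrative only: the names `GRADE_FLOORS` and `letter_grade` are hypothetical, and it assumes that a percentage falling between two listed bands (for example, 89.95%) stays in the lower band, in keeping with the no-rounding policy stated earlier.

```python
# Lower bound (in percent) of each letter grade band, highest first.
GRADE_FLOORS = [
    (97.0, "A+"), (93.0, "A"), (90.0, "A-"),
    (87.0, "B+"), (83.0, "B"), (80.0, "B-"),
    (77.0, "C+"), (73.0, "C"), (70.0, "C-"),
    (67.0, "D+"), (63.0, "D"), (60.0, "D-"),
    (0.0, "F"),
]


def letter_grade(percent: float) -> str:
    """Map a final course percentage to a letter grade without rounding."""
    if percent < 0:
        raise ValueError("percent must be non-negative")
    for floor, letter in GRADE_FLOORS:
        if percent >= floor:
            return letter
    raise ValueError("no matching grade band")  # unreachable for valid input


print(letter_grade(92.9))   # just below the A band
print(letter_grade(89.95))  # not rounded up to 90.0
```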
Policies¶
Generative AI¶
We are in an exciting area of generative AI development with the release of tools such as ChatGPT, DALL-E, GitHub Copilot, Bing Chat, Bard, Copy.ai, and many more. This course permits the ethical and responsible use of these tools except when explicitly noted. For example, you can use these tools as an on-demand tutor by asking them to explain complex topics.
Other ways are undoubtedly possible, but any use should aid—not replace—your learning. You must also be aware of the following aspects of generative AI.
- AI limitations: While AI programs can be valuable resources, they may produce inaccurate, biased, or incomplete material. Each program has its unique limitations as well.
- Bias and accuracy: Scrutinizing each aspect of the enormous data sets used to train these products is infeasible. AI will inherit biases and inaccuracies from these sources and from human influences in fine-tuning. You must be critical and skeptical of anything generated from these models and verify information from trusted sources.
- Critical thinking: Understand that AI is a tool, not a replacement for your analysis and critical thinking skills. Use AI to enhance your understanding and productivity, but remember that your development as a scholar depends on your ability to engage independently with the material.
- Academic integrity: Plagiarism extends to content generated by AI. Using AI-generated material without proper attribution is a violation of academic integrity policies. Always give credit to AI-generated content and adhere to citation rules.
  Furthermore, text from AI tools should be treated as someone else's work, because it is. You should never copy and paste text directly.
- AI detection: As discussed here, the University Center for Teaching and Learning does not recommend using AI detection tools like Turnitin due to high false positive rates. I will not use AI detection tools in any capacity for this course and trust that you will use these tools responsibly when permitted and desired.
Remember that generative AI is helpful when used responsibly. You can ethically benefit from these technological advances by adhering to these guidelines. Embrace this opportunity to expand your skill set and engage thoughtfully with emerging technologies. If you have any questions about AI tool usage, please get in touch with me for clarification and guidance.
Equity, diversity, and inclusion¶
The University of Pittsburgh does not tolerate any form of discrimination, harassment, or retaliation based on disability, race, color, religion, national origin, ancestry, genetic information, marital status, familial status, sex, age, sexual orientation, veteran status or gender identity or other factors as stated in the University's Title IX policy. The University is committed to taking prompt action to end a hostile environment that interferes with the University's mission. For more information about policies, procedures, and practices, visit the Civil Rights & Title IX Compliance web page.
I ask that everyone in the class strive to help ensure that other members of this class can learn in a supportive and respectful environment. If there are instances of the aforementioned issues, please contact the Title IX Coordinator, by calling 412-648-7860 or emailing titleixcoordinator@pitt.edu. Reports can also be filed online. You may also choose to report this to a faculty/staff member; they are required to communicate this to the University's Office of Diversity and Inclusion. If you wish to maintain complete confidentiality, you may also contact the University Counseling Center (412-648-7930).
Academic integrity¶
Students in this course will be expected to comply with the University of Pittsburgh's Policy on Academic Integrity. Any student suspected of violating this obligation during the semester will be required to participate in the procedural process initiated at the instructor level, as outlined in the University Guidelines on Academic Integrity. This may include, but is not limited to, the confiscation of the examination of any individual suspected of violating University Policy. Furthermore, no student may bring unauthorized materials to an exam, including dictionaries and programmable calculators.
To learn more about Academic Integrity, visit the Academic Integrity Guide for an overview. For hands-on practice, complete the Understanding and Avoiding Plagiarism tutorial.
Disability services¶
If you have a disability for which you are or may be requesting an accommodation, you are encouraged to contact both your instructor and Disability Resources and Services (DRS), 140 William Pitt Union, (412) 648-7890, drsrecep@pitt.edu, (412) 228-5347 for P3 ASL users, as early as possible in the term.
DRS will verify your disability and determine reasonable accommodations for this course.
Email communication¶
Upon admittance, each student is issued a University email address (username@pitt.edu).
The University may use this email address for official communication with students.
Students are expected to read emails sent to this account regularly.
Failure to read and react to University communications promptly does not absolve the student from knowing and complying with the content of the communications.
The University provides an email forwarding service that allows students to read their email via other service providers (e.g., Gmail, AOL, Yahoo).
Students who forward their email from their pitt.edu address to another address do so at their own risk.
If email is lost due to forwarding, it does not absolve the student from responding to official communications sent to their University email address.
Religious observances¶
The observance of religious holidays (activities observed by a religious group of which a student is a member) and cultural practices are an important reflection of diversity. As your instructor, I am committed to providing equivalent educational opportunities to students of all belief systems. At the beginning of the semester, you should review the course requirements to identify foreseeable conflicts with assignments, exams, or other required attendance. If possible, please contact me (your course coordinator/s) within the first two weeks of the first class meeting to allow time for us to discuss and make fair and reasonable adjustments to the schedule and/or tasks.
Sexual misconduct, required reporting, and Title IX¶
If you are experiencing sexual assault, sexual harassment, domestic violence, or stalking, please report it to me and I will connect you to University resources to support you.
University faculty and staff members are required to report all instances of sexual misconduct, including harassment and sexual violence to the Office of Civil Rights and Title IX. When a report is made, individuals can expect to be contacted by the Title IX Office with information about support resources and options related to safety, accommodations, process, and policy. I encourage you to use the services and resources that may be most helpful to you.
As your instructor, I am required to report any incidents of sexual misconduct that are directly reported to me. You can also report directly to the Office of Civil Rights and Title IX: 412-648-7860 (M-F; 8:30am-5:00pm) or via the Pitt Concern Connection at: Make A Report.
An important exception to the reporting requirement exists for academic work. Disclosures about sexual misconduct that are shared as a relevant part of an academic project, classroom discussion, or course assignment, are not required to be disclosed to the University's Title IX office.
If you wish to make a confidential report, Pitt encourages you to reach out to these resources:
- The University Counseling Center: 412-648-7930 (8:30 a.m. to 5 p.m., M-F) and 412-648-7856 (after business hours)
- Pittsburgh Action Against Rape (community resource): 1-866-363-7273 (24/7)
If you have an immediate safety concern, please contact the University of Pittsburgh Police, 412-624-2121
Any form of sexual harassment or violence will not be excused or tolerated at the University of Pittsburgh.
For additional information, please visit the full syllabus statement on the Office of Diversity, Equity, and Inclusion webpage.
Statement on classroom recording¶
To ensure the free and open discussion of ideas, students may not record classroom lectures, discussions and/or activities without the advance written permission of the instructor, and any such recording properly approved in advance can be used solely for the student's private use.
Ended: Syllabus
Lectures ↵
Lectures¶
TODO:
Lecture 01: Research anatomy ↵
Lecture 01
Course overview and anatomy of research
Date: Aug 28, 2024
Learning objectives¶
After today's lecture, you should be able to:
- Understand course structure and expectations.
- Outline the requirements and goals of the perspective primers.
- Compare and contrast different types of scientific articles.
- Explain the key components of the research ecosystem.
- Apply effective strategies to find relevant literature.
Presentation¶
- Live link: slides.com/d/E2MQpyk/live
- Download: biosc1630-l01.pdf
Activity
Literature search
This activity is designed to help you practice conducting a focused, efficient literature search on a computational biology topic, critically analyze key papers, and identify potential research directions. This will help you prepare for your homework assignment and paper. You will submit your completed assignment as a PDF through Gradescope.
You may work in groups, but each student must submit their own individual work.
Tip
- Pick a topic that interests you and go with it.
- Focus on skimming and extracting key information rather than reading papers in-depth.
- When identifying gaps, consider what's missing across multiple papers, not just one study.
Rapid topic exploration (10 minutes)¶
- Choose a perspective primer as your topic for this activity. Write this down in your document.
- Use Google Scholar to find one to two recent review papers (within the last five years) on your chosen topic.
- Skim the reviews and write down:
  - Your chosen topic.
  - At least three key subtopics or methodologies you identified from the review papers.
Focused Literature Search (20 minutes)¶
- Develop focused search strategies using at least four specific keywords or phrases based on your initial exploration.
- Conduct additional Google Scholar searches using your strategies.
  Use advanced search features like:
  - Exact phrase search: Use quotation marks to search for an exact phrase, e.g., `"machine learning"`.
  - Author search: Use `author:` followed by the name, e.g., `author:"J Smith"`.
  - Publication search: Use `source:` to search within a specific publication, e.g., `source:"Nature"`.
  - Date range: Use `after:` and `before:` to specify publication dates, e.g., `after:2020 before:2023`.
  - Title search: Use `intitle:` to search only in article titles, e.g., `intitle:robotics`.
  - Exclude terms: Use `-` before a word to exclude it from results, e.g., `machine learning -supervised`.
  - Boolean operators: Use `AND` to require both terms or `OR` to match either term (both in capitals), e.g., `AI OR "artificial intelligence"`.
  - Wildcard: Use `*` as a wildcard for unknown words, e.g., `effect of * on climate change`.
- Identify and select at least five primary research articles that seem most relevant and impactful for your primer.
- Write down:
  - Your search strategies (the keywords and any filters you used).
  - The titles and authors of your selected articles.
  - At least two takeaways for each article.
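If you want to keep a reproducible record of your search strategies, the operators listed above can be assembled programmatically. The sketch below is one possible approach, not a required workflow: `as_term` and `build_query` are hypothetical helper names, the example keywords are placeholders, and the URL pattern (a `q` parameter on `scholar.google.com/scholar`) is an assumption about how the search page accepts queries. It only composes a string; how Google Scholar interprets each operator is up to the search engine.

```python
from urllib.parse import urlencode


def as_term(term: str) -> str:
    """Wrap multi-word terms in quotation marks for exact-phrase search."""
    return f'"{term}"' if " " in term else term


def build_query(required, any_of=(), excluded=(), author=None, title_word=None):
    """Compose a search string from the operators described in this activity."""
    parts = [as_term(t) for t in required]
    if any_of:
        # OR (in capitals) matches either term.
        parts.append(" OR ".join(as_term(t) for t in any_of))
    # A leading minus sign excludes a term from the results.
    parts += [f"-{as_term(t)}" for t in excluded]
    if author:
        parts.append(f'author:"{author}"')
    if title_word:
        parts.append(f"intitle:{title_word}")
    return " ".join(parts)


# Placeholder keywords; substitute the ones from your own topic exploration.
query = build_query(
    required=["antimicrobial peptides"],
    any_of=["deep learning", "machine learning"],
    excluded=["review"],
)
print(query)
# Assumed URL pattern for opening the search in a browser.
print("https://scholar.google.com/scholar?" + urlencode({"q": query}))
```

Saving the printed query strings alongside your notes makes it easy to report the search strategies you used.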
Synthesis and Gap Identification (10 minutes)¶
Based on your analysis, address the following:
- Identify a common thread or trend across the papers.
- Identify your papers' approaches to address this common trend.
- Identify one potential gap or unexplored area in the current research.
Reflection (5 minutes)¶
Briefly reflect on the following:
- What did you learn about the literature search process?
- How might you apply these skills in future assignments or research projects?
- What challenges did you face during this activity, and how did you overcome them?
Ended: Lecture 01: Research anatomy
Lecture 02: Reading literature ↵
Lecture 02
Reading literature
Date: Sep 4, 2024
DRAFT
This page is a work in progress and is subject to change at any moment.
In today's lecture, we will read the following primary research article in computational biology together.
- Cao, Q., Ge, C., Wang, X., Harvey, P. J., Zhang, Z., Ma, Y., ... & Yu, R. (2023). Designing antimicrobial peptides using deep learning and molecular dynamic simulations. Briefings in Bioinformatics, 24(2), bbad058. DOI: 10.1093/bib/bbad058
Learning objectives¶
After today's lecture, you should be able to:
- Identify the main question, hypothesis or innovation, and significance.
- Evaluate and understand the methods.
- Interpret figures, tables, and data visualizations.
- Identify the key takeaways and conclusions.
- Apply critical thinking skills to assess the strengths, limitations, and biases.
Presentation¶
- Live link: slides.com/d/E2MQpyk/live
Activities ↵
Activity
Active introduction reading
The introduction of a scientific paper sets the stage for the entire study. It provides context, outlines the problem, and presents the study's objectives. The SQ3R method (Survey, Question, Read, Recite, Review) is an effective active reading strategy that can help you extract maximum value from this crucial section.
Objectives¶
By the end of this activity, you should be able to:
- Apply the SQ3R method to read and analyze the introduction of a computational biology paper
- Identify key components of an introduction (background, problem statement, objectives)
- Formulate relevant questions to guide your reading
- Synthesize and summarize the main points of an introduction
Instructions¶
1. Applying SQ3R¶
The SQ3R method is a reading comprehension technique designed to help readers effectively process and retain information from written material. It stands for Survey, Question, Read, Recite, and Review. This structured approach encourages active engagement with the text, promoting better understanding and recall of the content.
a. Survey (5 minutes)¶
With: Yourself
Begin by surveying the introduction. Quickly skim through the text, focusing on section headings, the first and last sentences of each paragraph, any emphasized text (bold or italicized), and any figures or tables present. This initial overview helps you grasp the general structure and main ideas of the introduction.
Deliverable
While you are surveying, write down and list any key headings, topics, and visual elements.
Tip
Pay special attention to the structure of the introduction. It often follows a pattern:
- Background/Context
- Problem Statement or Knowledge Gap
- Proposed Solution or Study Objectives
b. Question (10 minutes)¶
With: Your group
Based on your survey, you should write down questions that you expect this introduction to answer. These questions should guide your subsequent reading and help you engage more deeply with the content. Consider asking about
- the current state of knowledge in the field,
- the specific problem or gap the study addresses,
- the main objectives of the research, and
- the proposed methods or approaches.
c. Read (15 minutes)¶
With: Yourself
During this time, carefully read the introduction, actively seeking answers to the questions you formulated earlier. As you read, highlight or annotate key information to aid in your understanding and later review.
Tip
Look for signpost phrases that indicate important information, such as:
- "The aim of this study is..."
- "We hypothesize that..."
- "To address this problem, we..."
d. Recite (10 minutes)¶
With: Your group
After reading, recite the main points of the introduction in your own words. This step reinforces your understanding and helps identify any areas that may require further clarification.
Tip
Try explaining the introduction to a classmate. Teaching others is an excellent way to solidify your understanding.
e. Review (5 minutes)¶
With: Your group
Review and reflect on how the information in the introduction relates to the overall study. Consider how it sets up the rest of the paper and what predictions you can make about the methods or results based on this introduction.
Tip
Consider creating a concept map or flowchart to visualize how the ideas in the introduction connect to each other and to the broader study.
2. Class Discussion and Synthesis (10 minutes)¶
With: Everyone
Be prepared to share your experience applying the SQ3R method. Discuss the key components you identified in the introduction. Compare the questions you formulated and answers you found with your classmates.
Activity
Method scrutiny
Understanding and critically evaluating the methodology of a scientific paper is crucial for assessing the validity and reliability of its findings. This activity will guide you through the process of mapping out the experimental design, linking methods to results, and critically analyzing the sufficiency of the methodology.
Tip
- Be objective: Try to assess the methodology based on its scientific merit, not on whether you agree with the conclusions.
- Consider the context: Think about the state of the field and available technologies at the time the study was conducted.
- Look for justifications: Good papers often explain why specific methods or analyses were chosen.
- Think about alternatives: Consider whether there might have been better ways to address the research question.
- Be constructive: When identifying limitations, think about how they could be addressed in future studies.
Instructions¶
Create a Flowchart of the Experimental Design¶
With: Your group
Warning
Today's paper did this for us, but I am including it for future reference.
- Read through the methods section carefully.
- Identify the main steps of the experimental process.
- Create a flowchart using the provided template or a tool of your choice.
  - Start with data collection/preparation
  - Include key analytical or experimental steps
  - End with final analyses or validations
Example flowchart structure:
graph TD
A[Step 1: e.g., Data Collection] --> B[Step 2: e.g., Model Training]
B --> C[Step 3: e.g., Validation]
C --> D[Step 4: e.g., Testing]
A1[Results 1] --- A
B1[Results 2] --- B
C1[Results 3] --- C
D1[Results 4] --- D
Analyze Contribution to Research Question¶
With: Your group
For each step, write a brief explanation of how it contributes to answering the main research question. Consider:
- What specific aspect of the research question does this step address?
- How do the results from this step support or refute the hypothesis?
- Are there any limitations in how this step addresses the research question?
Critical Evaluation of Methodology¶
With: Your group
Assess the sufficiency and limitations of the methodology by considering the following questions:
- Appropriateness: Are the methods suitable for addressing the research question? Do the techniques used align with current best practices in the field?
- Completeness: Are all necessary controls included? Are sample sizes adequate for the conclusions drawn?
- Potential Biases: Are there any sources of bias in the experimental design? How might these biases affect the results?
- Reproducibility: Is there enough detail provided to reproduce the experiments? Are any key methodological details missing?
- Statistical Analyses: Are the statistical methods appropriate for the data? Are the statistical analyses correctly interpreted?
- Limitations: What are the main limitations of the methodology? How do these limitations affect the interpretation of the results?
Warning
Since this is our first paper, I do not expect you to answer these confidently. As you gain experience, you will become more mindful of things you have seen in the past.
Class Discussion¶
With: Everyone
Be prepared to share your group's insights with the class. Consider:
- What were the most significant strengths and weaknesses identified in the methodology?
- Were there any aspects of the methodology that were particularly innovative or noteworthy?
- How might the authors have improved their experimental design?
Activity
Data interpretation
Figures and tables are a crucial component of scientific papers, often conveying complex information in a visual format. This guide will help you develop the skills to effectively analyze and interpret figures in computational biology papers.
Tip
- Start with the big picture: What's the overall trend or pattern?
- Pay attention to labels and units: Understanding what's being measured is crucial.
- Look for comparisons: How do different groups or conditions differ?
- Consider error bars: They provide important information about data variability.
- Think critically: Just because it's published doesn't mean it's perfect. What would you do differently?
- Connect to the text: How does the information in the figure relate to what's described in the results and discussion?
Objectives¶
By the end of this activity, you should be able to:
- Critically analyze scientific figures
- Extract key information from complex visualizations
- Relate figure content to the overall study objectives
- Identify potential limitations and improvements in data presentation
Instructions¶
For each figure under "Results and discussion", we will do the following.
Small Group Analysis¶
With: Your group
- Initial Observation: Look at the figure carefully. Read the figure caption thoroughly. Identify the type of figure (e.g., graph, chart, diagram, image).
- Detailed Analysis: Answer the following questions:
  - What is the main message of this figure?
  - What do the axes represent? (If applicable)
  - What do different colors, shapes, or symbols represent?
  - Are there any trends, patterns, or outliers you can identify?
  - How does this figure support the study's motivation?
- Group Discussion:
  - Share your individual interpretations with your group.
  - Discuss any differences in your interpretations.
  - Try to reach a consensus on the figure's main message and how it relates to the study.
Figure Analysis Guide¶
With: Your group
For each figure, work through the following steps:
- Describe what you see (Observation):
- List all the elements present in the figure (e.g., bars, lines, points, labels).
- Note any visual hierarchies or groupings.
- Identify the scale and units used.
- Explain what it means (Interpretation):
- Translate the visual elements into scientific concepts.
- Describe the relationships between different elements.
- Identify the main trend or result being shown.
- Relate it to the study's objectives (Relevance):
- How does this figure address the research question?
- What specific hypothesis or prediction does it support or refute?
- How does it fit into the larger narrative of the paper?
- Identify any limitations or potential improvements (Critical Thinking):
- Are there any aspects of the data that are not represented?
- Is the chosen visualization the most effective for this data?
- How could the figure be improved for clarity or impact?
- Are there any potential sources of bias or misinterpretation?
Class Discussion¶
With: Everyone
Be prepared to share your group's insights with the class.
Activity
Skeptical results
After analyzing figures, it's crucial to compare your interpretations with the authors' claims in the results section. This activity will help you develop critical thinking skills and understand how visual and textual information work together in scientific papers.
Tip
- Be objective: Try to set aside your initial interpretations and read the results text with an open mind.
- Look for nuances: Sometimes the differences between your interpretation and the authors' claims may be subtle.
- Consider the broader context: The results text may provide important information about experimental conditions or analyses that aren't evident in the figures alone.
- Be critical but fair: It's okay to question the authors' interpretations, but also consider whether they have access to information or analyses that aren't fully represented in the figures.
- Think about communication: Reflect on how effectively the authors have presented their results through both figures and text.
Warning
We may not get to this activity, but that is okay!
Objectives¶
By the end of this activity, you should be able to:
- Compare your figure interpretations with the authors' claims in the results text
- Identify any discrepancies between visual and textual information
- Critically evaluate the presentation of results in scientific papers
Instructions¶
1. Read the Results Section¶
With: Your group
As you read, take notes on:
- The authors' main claims about each figure
- Any statistical analyses or numerical results presented
- How the authors relate the results to their research questions or hypotheses
2. Compare and Contrast¶
With: Your group
For each results figure, create a comparison table:
Aspect | Your Interpretation | Authors' Claims | Notes on Differences/Similarities |
---|---|---|---|
Main Message | | | |
Key Trends | | | |
Relation to Objectives | | | |
Statistical Support | | | |
- Fill in your interpretations from your earlier figure analysis
- Add the authors' claims from your step 1 notes on the results text
- In the last column, note any differences or similarities between your interpretation and the authors' claims
3. Small Group Discussion¶
With: Your group
Discuss any significant differences between your interpretations and the authors' claims. Consider possible reasons for these differences:
- Did you miss any important details in the figures?
- Did the authors provide additional context in the text that wasn't clear from the figures alone?
- Are there any claims in the text that don't seem fully supported by the figures?
4. Class Discussion¶
With: Everyone
Be prepared to share your group's insights with the class. Consider:
- What were the most common differences between student interpretations and authors' claims?
- Were there any figures that were particularly challenging to interpret without the text?
- Did the results text clarify any confusing aspects of the figures?
- Were there any cases where you feel the authors' claims weren't fully supported by the figures?
Lecture 03
Paper 01 - Methods
Date: Sep 11, 2024
Today's paper: Champion, C., Gall, R., Ries, B., Rieder, S. R., Barros, E. P., & Riniker, S. (2023). Accelerating Alchemical Free Energy Prediction Using a Multistate Method: Application to Multiple Kinases. Journal of Chemical Information and Modeling, 63(22), 7133-7147. DOI: 10.1021/acs.jcim.3c01469
Learning objectives¶
What you should be able to do after today's lecture:
- Describe the basic stages of drug discovery and explain the role of computational methods in modern drug design.
- Identify the main types of molecular forces and explain how they relate to binding affinity and free energy.
- Explain basic concepts of statistical thermodynamics, including ensemble averages and the relationship between microscopic properties and macroscopic observables.
- Explain the basic principles of molecular simulations, including the concept of force fields and molecular dynamics.
- Differentiate between relative and absolute binding free energies and discuss their importance in drug design.
- Compare and contrast Free Energy Perturbation (FEP) and Thermodynamic Integration (TI), including their advantages and limitations.
- Describe the concept of alchemical transformations and explain how they differ from physical pathways in free energy calculations.
- Define the concept of sampling in molecular simulations and explain why enhanced sampling methods are necessary for accurate free energy calculations.
- Explain how replica exchange methods enhance sampling in molecular simulations and their application in free energy calculations.
- Describe the basic concept of EDS and how it differs from traditional free energy calculation methods.
- Explain how RE-EDS combines replica exchange with EDS and discuss its advantages over standard methods.
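As a companion to the objectives above, here are the textbook forms of the main quantities, written as a sketch rather than as the paper's exact notation. The symbols are assumptions chosen for this summary: $U$ is the potential energy, $\beta = 1/(k_B T)$, $\lambda$ is the alchemical coupling parameter, $s$ is the EDS smoothness parameter, and $E_i^R$ are the EDS energy offsets. Check them against the definitions in the paper before using them in a presentation.

```latex
% Relative binding free energy from a thermodynamic cycle:
% the difference of two physical binding free energies equals the
% difference of two alchemical transformations (in complex and in solvent).
\Delta\Delta G_{\mathrm{bind}}^{A \to B}
  = \Delta G_{\mathrm{bind}}^{B} - \Delta G_{\mathrm{bind}}^{A}
  = \Delta G_{A \to B}^{\mathrm{complex}} - \Delta G_{A \to B}^{\mathrm{solvent}}

% Free Energy Perturbation (Zwanzig equation): average over state A only.
\Delta G_{A \to B}
  = -\frac{1}{\beta} \ln \left\langle e^{-\beta \left( U_B - U_A \right)} \right\rangle_{A}

% Thermodynamic Integration: integrate the mean derivative along lambda.
\Delta G_{A \to B}
  = \int_{0}^{1} \left\langle \frac{\partial U(\lambda)}{\partial \lambda} \right\rangle_{\lambda} \, d\lambda

% EDS reference state: a single potential that envelops all N end states.
U_R = -\frac{1}{\beta s} \ln \sum_{i=1}^{N} e^{-\beta s \left( U_i - E_i^R \right)}
```

FEP and TI both connect one pair of end states at a time, whereas the EDS reference state covers all $N$ end states in one simulation. In RE-EDS, the replicas differ in the smoothness parameter $s$.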
Presentation¶
- Live link: slides.com/d/1hTJNcg/live
Lecture 04
Paper 01 - Discussion
Date: Sep 18, 2024
Today's paper: Champion, C., Gall, R., Ries, B., Rieder, S. R., Barros, E. P., & Riniker, S. (2023). Accelerating Alchemical Free Energy Prediction Using a Multistate Method: Application to Multiple Kinases. Journal of Chemical Information and Modeling, 63(22), 7133-7147. DOI: 10.1021/acs.jcim.3c01469
Introduction¶
Duration: 15 minutes
- Welcome and session objectives (2 minutes)
- Brief recap of the paper and its significance (5 minutes)
- Explanation of the RE-EDS methodology and its context (5 minutes)
- Overview of the session format and expectations (3 minutes)
Group discussions and preparation¶
Divide students into 4 groups, each focusing on a specific part of the Results and Discussion. Please put your presentation in the Google Drive directory posted on the Canvas page.
NIK group
Guiding questions:
- How might the structural differences among the NIK inhibitors (e.g., ring opening, fusion of rings) affect their binding mechanisms and the challenges in simulating them?
- What factors could contribute to the differences observed between GAFF and OpenFF force fields for NIK? How might these differences impact our understanding of protein-ligand interactions?
- Considering the role of Arg410 in NIK simulations, how might protein flexibility influence the accuracy of binding free energy calculations? What implications does this have for drug design?
- How do the results from RE-EDS compare to those from H-RE TI simulations for NIK? What might explain any differences observed?
- What are the potential advantages and limitations of using hybrid topology versus dual topology in the NIK simulations? How might this choice affect future free energy calculations?
Presentation (minimum) requirements:
- Explain the structural differences among the NIK inhibitors
- Compare and contrast the results obtained with GAFF and OpenFF
- Discuss the role of Arg410 in the NIK simulations and its different conformations
- Present the comparison between RE-EDS and H-RE TI results
- Explain the differences between dual and hybrid topology simulations
- Present the key metrics (MUE, RMSE, Kendall τ, Spearman ρ) for NIK
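If the four metrics are unfamiliar, the sketch below computes them in plain Python for a made-up set of calculated and experimental binding free energies. The numbers and function names are hypothetical, the rank statistics assume no tied values, and this is not the authors' analysis code.

```python
import math

def mue(pred, ref):
    """Mean unsigned error: average absolute deviation."""
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)

def rmse(pred, ref):
    """Root-mean-square error: penalizes large deviations more than MUE."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))

def kendall_tau(pred, ref):
    """Kendall tau-a: (concordant - discordant pairs) / all pairs. Assumes no ties."""
    n = len(ref)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (pred[i] - pred[j]) * (ref[i] - ref[j])
            score += (s > 0) - (s < 0)
    return score / (n * (n - 1) / 2)

def ranks(values):
    """Rank from 1 (smallest) to n (largest). Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(pred, ref):
    """Spearman rho from squared rank differences. Assumes no ties."""
    n = len(ref)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(pred), ranks(ref)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Made-up binding free energies for four hypothetical ligands.
experimental = [-9.0, -8.0, -10.0, -7.0]
calculated = [-8.5, -8.8, -9.6, -7.4]

print("MUE :", round(mue(calculated, experimental), 3))
print("RMSE:", round(rmse(calculated, experimental), 3))
print("tau :", round(kendall_tau(calculated, experimental), 3))
print("rho :", round(spearman_rho(calculated, experimental), 3))
```

MUE and RMSE measure how far the predicted values are from experiment, while Kendall τ and Spearman ρ measure only whether the ligands are ranked in the right order. A method can do well on one kind of metric and poorly on the other.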
Data:
- Analyze Figure 6A and Table 1. Compare the performance of GAFF, OpenFF, and H-RE TI for NIK inhibitors. What trends do you notice in the calculated vs. experimental binding free energies?
- Examine Figure 6B. How does the choice of topology (dual vs. hybrid) affect the results for OpenFF? What might explain these differences?
- Study Figure 7 carefully. How do the different conformations of Arg410 relate to the simulation results? What does this suggest about protein flexibility in binding free energy calculations?
- Look at Figure S5 in the Supporting Information. How do the crystal structures of NIK inhibitors compare to the simulated conformations? What insights does this provide?
- Analyze the statistical metrics in Table 1 for NIK. What do these values tell you about the accuracy and reliability of the different methods and force fields?
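When judging how large an error in binding free energy is, it can help to translate it back into affinity. The sketch below uses the standard relation ΔG = RT ln(Kd / 1 M). The temperature, the kJ/mol units, and the function name are assumptions for illustration, and experimental values reported as IC50 are not strictly dissociation constants.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)

def binding_free_energy(kd_molar, temperature=298.15):
    """Standard binding free energy in kJ/mol from a dissociation constant in mol/L."""
    return R_KJ_PER_MOL_K * temperature * math.log(kd_molar)

dg_1nm = binding_free_energy(1e-9)   # a 1 nM binder
dg_10nm = binding_free_energy(1e-8)  # a 10 nM binder
per_decade = dg_10nm - dg_1nm        # cost of a 10-fold loss in affinity

print(f"1 nM  -> {dg_1nm:.1f} kJ/mol")
print(f"10 nM -> {dg_10nm:.1f} kJ/mol")
print(f"10-fold change in Kd ~ {per_decade:.1f} kJ/mol")
```

At room temperature a 10-fold change in affinity corresponds to roughly 5.7 kJ/mol (about 1.4 kcal/mol), which gives a useful yardstick for reading the MUE and RMSE values in the tables.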
PAK group
Guiding questions:
- How do the different types of heterocycle transformations in the PAK dataset reflect real-world drug optimization strategies?
- What molecular or structural features might explain the differences in accuracy between PAK and NIK results?
- How could the binding pose of PAK inhibitors and the hydrogen bond with Lys299 inform structure-based drug design efforts?
- What implications does the difference in dihedral angle sampling between GAFF and OpenFF have for force field development and selection in future studies?
- How might the findings from the PAK simulations, particularly regarding buried substituents, influence the design of next-generation kinase inhibitors?
Presentation (minimum) requirements:
- Describe the PAK inhibitor modifications and their significance
- Compare the accuracy of PAK results to those of NIK
- Present the key metrics for PAK using both GAFF and OpenFF
- Explain the binding pose of PAK inhibitors and its impact on results
- Discuss the importance of the hydrogen bond with Lys299
- Analyze the differences in dihedral angle sampling between GAFF and OpenFF for PAK
Data:
- Examine Figure 8A and Table 2. Compare the performance of GAFF and OpenFF for PAK inhibitors. Which force field seems to perform better, and why?
- Study Figure 8B carefully. How does the binding pose of ligand K.1 inform your understanding of the PAK inhibitor interactions?
- Analyze Figures 8C, 8D, and 8E together. What do these figures tell you about the differences in dihedral angle sampling and hydrogen bonding between GAFF and OpenFF?
- Look at the correlation metrics (Kendall τ and Spearman ρ) in Table 2. What do these values suggest about the ranking ability of each force field for PAK inhibitors?
- Compare the MUE and RMSE values for PAK with those of other kinases in the study. What does this comparison tell you about the relative difficulty of modeling PAK inhibitors?
CHK1 group
Guiding questions:
- Why might studying different subsets of CHK1 inhibitors be important, and what can we learn from comparing their results?
- How do the results from this study compare to previous methods like FEP+ and QligFEP for the smaller CHK1 subset? What might account for any differences or similarities?
- What factors could contribute to the higher accuracy of CHK1 results compared to other kinases? How might this inform future computational studies?
- How could the performance differences between GAFF and OpenFF for CHK1 guide force field selection in future drug discovery projects?
- Based on the CHK1 results, what recommendations would you make for improving the accuracy of binding free energy calculations in kinase systems with solvent-exposed substituents?
Presentation (minimum) requirements:
- Describe the two CHK1 subsets and their key characteristics
- Compare the results of the smaller subset to previous studies (e.g., FEP+, QligFEP)
- Present the key metrics for both CHK1 subsets
- Explain any differences in performance between GAFF and OpenFF
- Discuss factors that might contribute to the high accuracy of CHK1 results
Data:
- Analyze Figure 9A and the corresponding data in Table 3 for the smaller CHK1 subset. How do the results compare across GAFF, OpenFF, and the previous GROMOS force field?
- Examine Figure 9B and the related data in Table 3 for the larger CHK1 subset. What differences do you notice between the two subsets, and what might explain these differences?
- Study Figure S8 in the Supporting Information. How do the RE-EDS results for CHK1 compare to other methods like FEP+ and QligFEP?
- Look at Figure S9 in the Supporting Information. What does the sampling distribution tell you about the efficiency of the RE-EDS method for CHK1 inhibitors?
- Compare the statistical metrics for CHK1 in Table 3 with those of other kinases in the study. What conclusions can you draw about the relative ease or difficulty of modeling CHK1 inhibitors?
PIM group
Guiding questions:
- How do the structural modifications in the PIM dataset, including bulky substituents, reflect current trends in kinase inhibitor design?
- What molecular or structural features might explain the differences in accuracy between PIM and CHK1 results?
- How could the challenges encountered with ligand M.4 (e.g., rearrangement of Arg122) inform future improvements in simulation protocols or force field parameterization?
- What are the potential implications of the performance differences between GAFF and OpenFF for PIM in the context of kinase drug discovery?
- How might the impact of bulky substituents on sampling and results influence the design and optimization of PIM inhibitors in the future?
Presentation (minimum) requirements:
- Describe the structural modifications in the PIM inhibitor set
- Present the key metrics for PIM and compare them to CHK1
- Explain the challenges encountered, particularly with ligand M.4
- Compare the performance of GAFF and OpenFF for PIM
- Discuss the impact of bulky substituents on sampling and results
Data:
- Analyze Figure 10A and Table 4. Compare the performance of GAFF and OpenFF for PIM inhibitors. What trends do you notice, and how do they differ from other kinases in the study?
- Examine Figure 10B carefully. How does the binding pose of ligands M.1 and M.4 inform your understanding of the challenges in modeling PIM inhibitors, especially those with bulky substituents?
- Study Figures S11 and S12 in the Supporting Information. What do these figures reveal about the differences in dihedral angle sampling between GAFF and OpenFF for ligand M.4?
- Look at Figure S10 in the Supporting Information. How does the sampling distribution for PIM compare to that of CHK1 (Figure S9)? What might explain any differences?
- Analyze the statistical metrics in Table 4 for PIM. How do these values compare to the null model, and what does this tell you about the performance of RE-EDS for PIM inhibitors?
For each group, consider:
- What were the main findings in this section?
- How do these results compare to previous studies?
- What limitations or challenges were identified?
- What are the implications of these findings?
- How does this section contribute to the overall narrative of the paper?
Group Presentations¶
Each group presents their section and other groups ask questions after each presentation.
Break¶
Duration: 10 minutes
Critical Analysis Activity¶
This worksheet covers various aspects of the paper, encouraging students to engage deeply with the content, critically evaluate the methods and results, and consider the broader implications of the research. It also promotes scientific thinking by asking students to propose future directions and reflect on their learning. Students discuss in pairs (10 minutes), then share with the class (10 minutes).
Part 1: General Understanding¶
- Briefly summarize the main objective of this study in your own words.
- Explain the key differences between RE-EDS and traditional pairwise free energy methods.
- List the four kinases studied and describe one unique feature of each kinase's inhibitor set.
Part 2: Data Analysis¶
For your assigned kinase group (NIK, PAK, CHK1, or PIM), answer the following:
- Describe the main trends you observe in the calculated vs. experimental binding free energies.
- Compare the performance of GAFF and OpenFF force fields for your kinase. Which performed better and why?
- Identify a specific challenge in modeling your kinase's inhibitors and explain its potential impact on the results.
- Analyze the sampling distribution for your kinase. What does it tell you about the efficiency of the RE-EDS method?
Part 3: Critical Evaluation¶
- Identify 2 strengths of this study.
- Identify 2 weaknesses or limitations of this study.
- What additional experiments or analyses would you suggest to address the limitations or expand on the findings?
- How might these findings impact the field of computational drug design, particularly for kinase inhibitors?
- What ethical considerations might arise from this research and its potential applications?
Part 4: Comparative Analysis¶
- Compare the performance of RE-EDS across the four kinases. Which kinase showed the best results and why do you think this is the case?
- Examine Figure 11 and related data. How does the computational efficiency of RE-EDS compare to traditional methods? What implications does this have for drug discovery?
Part 5: Future Directions¶
- Based on the study's findings, propose a specific research question or hypothesis for future investigation in this field.
- Describe a potential application of the RE-EDS method outside of kinase inhibitor design. How might it be useful in other areas of drug discovery or computational chemistry?
Part 6: Reflection¶
- What aspect of this study did you find most interesting or surprising? Why?
- How has this paper changed or reinforced your understanding of computational methods in drug design?
- If you could ask the authors one question about their work, what would it be?
Part 7: Communication¶
- Create a brief (3-5 bullet points) summary of the key findings and significance of this study that could be understood by a non-expert audience.
Break¶
Duration: 10 minutes
Reflection¶
Group Discussion¶
In small groups of 3-4, discuss the following:
1. What are the three most important takeaways from this study?
2. How does this research advance our understanding of computational drug design?
3. What challenges or limitations of the RE-EDS method did you identify?
4. How might this method impact the drug discovery pipeline in pharmaceutical companies?
Summarize Key Points¶
As a class, create a list of the top 5 key points from the group discussions.
Individual Reflection¶
Write a brief reflection addressing the following:
- What is the most significant concept you learned from this paper? Explain why you find it important.
- How has this study changed your understanding of:
- Computational methods in drug design?
- The challenges in predicting protein-ligand interactions?
- Identify one aspect of the RE-EDS method that you'd like to understand better. Formulate this as a specific question.
- Imagine you're explaining this research to a friend who isn't in science. Write 2-3 sentences summarizing why this work matters.
- If you were to continue this research, what would be your next step or question to investigate?
Closing Thought¶
Consider and jot down one way this research might influence your future studies or career interests.
Lecture 05
Cancelled
Date: Sep 25, 2024
Danger
Class is cancelled due to the Cathedral of Learning being closed.
Lecture 06
Paper 02 - Methods & Discussion
Date: Oct 2, 2024
Today's paper: Zhu, W., Zhang, Y., Zhao, D., Xu, J., & Wang, L. (2022). HiGNN: A hierarchical informative graph neural network for molecular property prediction equipped with feature-wise attention. Journal of Chemical Information and Modeling, 63(1), 43-55. DOI: 10.1021/acs.jcim.2c01099
Learning objectives¶
What you should be able to do after today's lecture:
- Compare and contrast different molecular representations.
- Describe basic concepts of neural networks and deep learning as applied to molecular property prediction.
- Explain the fundamental principles of Graph Neural Networks (GNNs) and their application.
- Describe the concepts of message passing and aggregation in GNNs.
- Discuss the role of attention mechanisms in neural networks for molecular property prediction.
- Describe the concepts of chemical fragments, pharmacophores, and molecular scaffolds.
- Describe the main components of HiGNN's architecture.
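To make message passing and aggregation concrete, here is a minimal sketch of one round on a three-atom graph, followed by a sum readout. It is a generic illustration with no learned weights, not HiGNN's actual layer, and the feature encoding and function names are assumptions.

```python
def message_passing_step(features, edges):
    """One round: each node adds the sum of its neighbors' feature vectors to its own."""
    neighbors = {node: [] for node in features}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    updated = {}
    for node, own in features.items():
        # Aggregation: element-wise sum of the messages from all neighbors.
        aggregated = [0.0] * len(own)
        for nb in neighbors[node]:
            for k, value in enumerate(features[nb]):
                aggregated[k] += value
        # Update: combine the node's own features with the aggregated message.
        updated[node] = [o + a for o, a in zip(own, aggregated)]
    return updated

def readout(features):
    """Sum all node vectors into a single molecule-level vector."""
    dim = len(next(iter(features.values())))
    return [sum(vec[k] for vec in features.values()) for k in range(dim)]

# Heavy atoms of ethanol, C0 - C1 - O2, with features [is_carbon, is_oxygen].
features = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0]}
edges = [(0, 1), (1, 2)]

updated = message_passing_step(features, edges)
print(updated)           # per-atom features after one round
print(readout(updated))  # molecule-level vector
```

After one round, the two carbons no longer share the same vector because only one of them is bonded to the oxygen. Real GNN layers replace the plain sums with learned transformations, but the flow of information is the same.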
Activity¶
For this journal club, you will be split into five groups, each responsible for presenting a portion of the paper “HiGNN: A Hierarchical Informative Graph Neural Network for Molecular Property Prediction Equipped with Feature-Wise Attention” by Zhu et al. Each group will prepare a set of lecture slides during class and present for 10 minutes, followed by a short Q&A.
Your task is to explain your assigned section clearly, ensuring your classmates understand the key points, and to engage the class in a discussion on the relevance and impact of the research.
- Preparation Time: You will have time in class to prepare your slides. Each group should create approximately 5-7 slides for their presentation.
- Presentation Time: Each group will present for 10 minutes, with an additional 5 minutes for questions from the audience.
- Content: Summarize the key points of your assigned section. You are encouraged to include visuals (e.g., figures, tables) from the paper to aid understanding.
- Discussion Questions: At the end of your presentation, ask 2-3 questions to engage the class in discussion about your section.
- Teamwork: Split the work evenly among group members. Each member should have a speaking role during the presentation.
- Focus: Highlight the core concepts, and avoid getting caught up in overly technical details unless they are essential to your section.
Group 1: Introduction and Background¶
Removed.
Group 2: HiGNN Framework Architecture¶
Assigned Sections:
- Methods: HiGNN Architecture (pp. 45-47, covering molecular graph and BRICS fragmentation)
Your Goals:
- Provide a detailed explanation of the HiGNN architecture.
- Focus on the hierarchical design of HiGNN and how it processes both molecular graphs and BRICS fragments.
- Explain the role of the feature-wise attention mechanism in recalibrating atomic features.
- Highlight how these architectural innovations lead to improved molecular property predictions.
Suggested Slide Breakdown:
- Overview of HiGNN architecture.
- Explanation of molecular graph processing.
- Introduction to BRICS fragmentation and its integration.
- Description of the feature-wise attention mechanism.
- How these components interact to improve predictions.
Discussion Questions:
- How does the hierarchical design of HiGNN differ from traditional GNNs in molecular property prediction?
- Why is the feature-wise attention mechanism a crucial innovation in this model?
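For Group 2, one way to build intuition for "recalibrating atomic features" is a channel-gating sketch in the spirit of squeeze-and-excitation. This is an assumption about the general idea, not the formulation used in the paper: the weights, biases, and function names are hypothetical, and a real model would learn them.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def feature_wise_gate(atom_features, weights, biases):
    """Rescale each feature channel by a gate in (0, 1) computed from the whole molecule."""
    n_atoms = len(atom_features)
    n_channels = len(atom_features[0])
    # Squeeze: average each feature channel over all atoms.
    pooled = [sum(atom[c] for atom in atom_features) / n_atoms for c in range(n_channels)]
    # Excite: one gate per channel (hypothetical per-channel weight and bias).
    gates = [sigmoid(weights[c] * pooled[c] + biases[c]) for c in range(n_channels)]
    # Recalibrate: multiply every atom's channel c by the shared gate for channel c.
    rescaled = [[atom[c] * gates[c] for c in range(n_channels)] for atom in atom_features]
    return gates, rescaled

# Two atoms with two feature channels each (made-up values).
atom_features = [[2.0, 4.0], [0.0, 4.0]]
gates, rescaled = feature_wise_gate(atom_features, weights=[1.0, -1.0], biases=[0.0, 0.0])

print("gates   :", [round(g, 3) for g in gates])
print("rescaled:", [[round(v, 3) for v in atom] for atom in rescaled])
```

The gate acts on feature channels rather than on atoms: every atom keeps its position in the graph, but channels the gate scores as less informative are scaled down for all atoms at once.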
Group 3: Experimental Setup and Data Sets¶
Assigned Sections:
- Methods: Benchmark Data Sets and Hyperparameters (pp. 48-49)
Your Goals:
- Explain the benchmark data sets used in the study and why they are relevant for drug discovery.
- Discuss the importance of using multiple data sets for evaluating model performance.
- Provide an overview of the training process, including the hyperparameter optimization.
- Mention the significance of splitting the data into training, validation, and test sets.
Suggested Slide Breakdown:
- Introduction to the data sets used in the study.
- Relevance of each data set for molecular property prediction (mention a few key data sets like ESOL, FreeSolv, BACE, etc.).
- Overview of the training process.
- Hyperparameter optimization and its role in the study.
- Significance of data splitting (random vs. scaffold splitting).
Discussion Questions:
- Why is it important to evaluate the model on a variety of data sets?
- How does scaffold splitting improve the generalizability of the model compared to random splitting?
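For Group 3, the difference between random and scaffold splitting can be sketched in a few lines. The version below assumes each molecule already carries a scaffold label. The labels here are hand-assigned placeholders, whereas real pipelines typically derive scaffolds with a cheminformatics toolkit. The largest-group-first assignment is one common convention, not necessarily the one used in the paper.

```python
from collections import defaultdict

def scaffold_split(scaffolds, train_fraction=0.8):
    """Assign whole scaffold groups to train or test so no scaffold appears in both."""
    groups = defaultdict(list)
    for index, scaffold in enumerate(scaffolds):
        groups[scaffold].append(index)

    # Largest groups first; ties broken by first index so the result is deterministic.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))

    train, test = [], []
    cutoff = train_fraction * len(scaffolds)
    for group in ordered:
        if len(train) + len(group) <= cutoff:
            train.extend(group)
        else:
            test.extend(group)
    return train, test

# Hypothetical scaffold labels for ten molecules.
scaffolds = [
    "benzene", "benzene", "benzene",
    "indole", "indole",
    "pyridine", "pyridine",
    "furan", "furan",
    "thiophene",
]

train, test = scaffold_split(scaffolds)
print("train:", train)
print("test :", test)
```

With a random split, close analogs of a test molecule usually end up in the training set, so the test score can overstate how well the model handles new chemistry. Holding out entire scaffolds makes the test set look more like the unseen chemotypes a model faces in practice.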
Group 4: Results and Performance Analysis¶
Assigned Sections:
- Results and Discussion (pp. 49-50)
Your Goals:
- Summarize the model’s performance on different data sets.
- Compare HiGNN’s performance with other models such as GCN, GAT, and Chemprop.
- Highlight key findings, especially in tasks related to drug discovery, such as predicting ADMET properties.
- Discuss why HiGNN outperformed other models in most cases and what that implies for future research.
Suggested Slide Breakdown:
- Overview of performance results on key data sets.
- Comparison of HiGNN with other models (focus on top-performing models).
- Specific success stories (e.g., BACE, BBBP data sets).
- Discussion of HiGNN’s strength in predicting ADMET properties.
- What do these results mean for future applications?
Discussion Questions:
- In which areas does HiGNN significantly outperform other models, and why?
- What might be some limitations of HiGNN based on its performance across different tasks?
Group 5: Interpretability and Case Studies¶
Assigned Sections:
- Interpretation of HiGNN: Case Studies on BACE and BBBP (pp. 50-52)
Your Goals:
- Explain the molecular-fragment similarity mechanism and its role in making HiGNN interpretable.
- Use the BACE and BBBP case studies to demonstrate how HiGNN identifies key molecular fragments.
- Discuss how this interpretability can aid chemists in drug design.
- Highlight the practical implications of the findings from the case studies.
Suggested Slide Breakdown:
- Overview of HiGNN’s interpretability mechanism (molecular-fragment similarity).
- Case study 1: BACE (show how HiGNN identifies key fragments).
- Case study 2: BBBP (explain how permeability predictions work).
- Importance of model interpretability in drug discovery.
- Potential future applications of this interpretability.
Discussion Questions:
- How does the molecular-fragment similarity mechanism improve the interpretability of HiGNN’s predictions?
- Why is interpretability important in drug discovery models?
Lecture 07
Introduction editing
Date: Oct 9, 2024
Peer editing¶
As you read and review your peer’s introduction, focus on the following key areas. Provide constructive feedback aimed at helping your classmate improve their work. Use these questions and tips as a guide for your review.
Engaging Opening Statement¶
What to Look For:
- Does the introduction start with an engaging, clear statement that captures attention?
- Does it present a specific trend, challenge, or recent advancement that directly relates to the main topic of the paper?
Common Mistakes:
- Opening with broad or overly general statements like “Proteins are important for life.”
- Using vague language that doesn't engage the reader.
Example
Your opening statement is a little too broad. Try focusing on a recent breakthrough in your topic to make it more engaging. For example, you could highlight a new finding in molecular dynamics.
Background Funnel¶
What to Look For:
- Does the introduction flow logically from general information about the topic to a more specific focus?
- Is there enough context provided about the field to set the stage for the rest of the article?
- Does the background explain why the topic is important in the broader context of biology or drug discovery?
Common Mistakes:
- Jumping into specific details too quickly without providing enough background.
- Being overly vague and not explaining the significance of the topic.
Example
You need more context about why molecular docking is important in drug discovery before diving into specific limitations. Try adding a sentence that explains its role in finding potential drug targets.
Identifying Gaps¶
What to Look For:
- Does the introduction clearly identify gaps or challenges in current research?
- Are the gaps specific and relevant to the paper’s main argument?
Common Mistakes:
- Not clearly identifying gaps, or only vaguely mentioning that “there are limitations.”
- Failing to explain why these gaps are important.
Example
You mention that there are limitations in AlphaFold, but it’s not clear what they are. Try specifying exactly what AlphaFold struggles with, like disordered regions or large protein complexes.
Presenting the Thesis/Argument¶
What to Look For:
- Is the main thesis or argument of the paper clearly stated?
- Does the thesis provide a specific, debatable perspective that the paper will explore?
Common Mistakes:
- Having an unclear or vague thesis.
- Stating that the paper will “discuss” or “explain” the topic without providing an argument or perspective.
Example
Your thesis is a bit unclear. Instead of just saying ‘this paper will explore protein structure prediction methods,’ try something like ‘this paper argues that integrating deep learning with ab initio models offers a more accurate approach to protein structure prediction.’
Article Structure Outline¶
What to Look For:
- Does the introduction provide a brief outline of the article?
- Does the outline logically follow from the thesis and give the reader a clear sense of what’s coming next?
Common Mistakes:
- Missing the outline entirely or being vague about what the rest of the paper will cover.
Example
Your introduction could benefit from a brief outline of the structure. For example, mention that the paper will first explore AlphaFold, then compare it to ab initio methods, and finally propose a hybrid approach.
Clarity and Writing Style¶
What to Look For:
- Is the writing clear and easy to understand?
- Does the introduction avoid jargon or explain technical terms when necessary?
- Are sentences concise and to the point?
Common Mistakes:
- Overusing jargon or writing overly complex sentences.
- Having sentences that are too long or confusing.
Example
Some sentences are a bit hard to follow, especially the one about molecular dynamics. Try simplifying it by breaking it into two shorter sentences.
Use of Literature and Citations¶
What to Look For:
- Are relevant studies and seminal papers cited to provide background and support for the thesis?
- Are citations formatted correctly (APA style)?
Common Mistakes:
- Failing to include citations or using too few.
- Incorrect citation format (e.g., missing information, inconsistent formatting).
Example
You need more citations to support your background section. Try referencing a recent study that discusses the limitations of molecular docking.
Length¶
What to Look For:
- Is the introduction within the required length (around 500-750 words)?
- Does it avoid going into too much detail that should be saved for later sections of the paper?
Common Mistakes:
- Writing an introduction that is too long or too short.
- Including too many technical details that belong in the main body of the article.
Example
Your introduction is a bit too detailed in the discussion of MD simulations. Try saving some of this for later and keeping the introduction more focused on setting up the main argument.
General Improvements¶
What to Look For:
- Does the introduction as a whole flow logically and set up the rest of the paper?
- Are there any areas where the writing could be more concise or more clearly explained?
Common Mistakes:
- Disorganized flow between paragraphs or ideas.
- Being too wordy or using unclear explanations.
Example
I think the transition between your background section and your thesis could be smoother. Try revising the last sentence of the background to lead directly into your argument.
Additional Tips for Peer Review:¶
- Be constructive and respectful: Focus on how your feedback can help your peer improve. Avoid vague or negative comments like “This doesn’t make sense.” Instead, offer suggestions for improvement, like “This section could be clearer if you explained how molecular docking leads to faster drug discovery.”
- Give specific examples: Instead of saying “This is unclear,” explain why it’s unclear and suggest how to make it better.
- Balance praise with critique: Highlight what your peer has done well in addition to offering suggestions for improvement.
- Use track changes and comments: If you’re working digitally, use track changes to show edits and leave comments where something could be revised. If you’re working on paper, use clear markings and notes.
Alex's general feedback¶
Based on my review of the introductions submitted by the class, here is some general feedback that I believe will be helpful for everyone as they revise their drafts and engage in peer review:
Engagement with the Topic:
Many of you have done a good job introducing your topics and explaining the importance of your perspective within computational biology. However, several drafts could benefit from a more compelling opening statement that immediately engages the reader. Instead of broad or overly general statements, try to highlight a specific trend, challenge, or recent advancement in your field that directly connects to your main argument. This will set a strong tone for the rest of your paper.
Background Funnel:
The background section should smoothly transition from broader concepts to the specific focus of your paper. While most of you provide a good context, some drafts jump too quickly into specifics without adequately setting the stage. Make sure your background flows logically by introducing general concepts first, then gradually narrowing the focus to your particular area of interest.
Identifying Gaps:
One common area for improvement is the section identifying current gaps in research. This part of the introduction is crucial for justifying why your perspective is necessary. Some drafts touch on challenges but don't clearly articulate what key gaps your article will address. Focus on identifying one or two significant gaps in the literature or research landscape and briefly hint at how your perspective will contribute to filling them.
Introducing Your Perspective:
Several of you did well in introducing your main argument, but in some cases, the thesis statement could be clearer or more specific. Make sure to clearly state your unique perspective or the central argument of your paper. This statement should be debatable, setting the stage for the arguments you'll develop later. It’s not enough to merely summarize what’s known; instead, make sure to explain how your perspective adds to the conversation or offers a new take on the issue.
Use of Literature:
The literature cited in many drafts is generally relevant, but in some cases, it’s either too sparse or too dense. Try to strike a balance between citing foundational studies that set the stage for your argument and recent breakthroughs that are directly relevant to your perspective. Ensure that your citations are purposeful, helping to build the case for your perspective.
Writing Quality and Clarity:
Overall, the writing quality is strong, but some drafts could benefit from additional clarity and conciseness. Avoid overly technical jargon unless it is necessary and be mindful of your audience, which consists of your peers who may not be familiar with every detail of your topic. Keep your sentences clear and to the point.
Structure Outline:
In many drafts, the outline of the article structure is missing or only briefly mentioned. It’s important to provide a clear roadmap for your readers so they know what to expect. Spend 1–2 sentences briefly outlining the main sections of your article to help guide your reader through your argument.
Balance of Depth and Breadth:
Be cautious of going too in-depth in the introduction. The goal is to provide a snapshot of the key ideas without delving into the specifics of your argument or findings. The introduction should set the stage without overwhelming the reader with too much detail.
Clarity of Purpose and Opening Statements:
Many students introduced their topics clearly, but some drafts could benefit from a more engaging opening that immediately draws the reader into the significance of the topic. Rather than starting with broad statements, consider highlighting a recent trend or challenge to capture attention. For example, opening with a recent breakthrough in molecular dynamics or a key limitation in computational drug discovery can be more effective than a general introduction.
General Writing Tips
- Avoid long, complex sentences and jargon that your readers do not need. Aim for clarity.
- The introduction should not dive too deeply into technical details. Save those for the main body of the paper.
- Each paragraph should lead naturally to the next, maintaining a coherent structure from beginning to end.
Lecture 08
Paper 03 - Reading
Date: Oct 16, 2024
Today's paper: Abramson, J., Adler, J., Dunger, J., Evans, R., Green, T., Pritzel, A., ... & Jumper, J. M. (2024). Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature, 630(8016), 493-500. DOI: 10.1038/s41586-024-07487-w
Instructions¶
You will be working in your assigned group to present a key aspect of AlphaFold 3 methodology. The focus should be on understanding how this part of the model works, rather than its history or applications. Make sure to thoroughly explore the specific methodology behind your assigned topic.
Research and Understanding¶
Begin by reading the relevant sections of the AlphaFold 3 paper that correspond to your group’s topic. You are also encouraged to look up supporting references cited in the paper for deeper understanding. As you research, think about why this methodology is important for AlphaFold 3. Be prepared to explain not just how something works, but why it is critical for accurate protein structure prediction.
Creating the Slides¶
Each group has a maximum of 12 minutes for the presentation, plus an additional 3-5 minutes for discussion. Feel free to use the AlphaFold 3 paper and any other cited resources to enhance your understanding. External sources are allowed, but make sure they are reputable and relevant. If you use additional sources or diagrams, please include proper citations on your slides. All members of your group should contribute to the presentation and participate in the delivery. Decide in advance who will present which sections.
Your presentation should cover the following key components:
- Introduction to Your Methodology
    - Briefly introduce the specific component of AlphaFold 3 your group is focusing on.
- Step-by-Step Explanation
    - Use clear and detailed diagrams to walk through the core methodology.
    - Break down complex concepts into smaller, digestible parts.
    - Make sure to use relevant terminology and explain any new terms that appear in your presentation.
- Key Technical Insights
    - Highlight the critical steps that make this method work effectively in AlphaFold 3.
    - Provide examples (e.g., diagrams or charts) to show how the methodology improves prediction accuracy or efficiency.
- Challenges or Limitations
    - Discuss any known challenges, limitations, or trade-offs of the method.
    - Consider asking questions like: "What happens if this method doesn’t work as expected?" or "What are the limitations in certain cases?"
- Summary and Discussion Questions
    - Conclude with a brief summary of your presentation.
    - Prepare 1-2 critical thinking questions to ask the class, fostering discussion after your presentation.
Slide Presentation Guidelines¶
- Clarity and Simplicity: Make sure your slides are not overcrowded. Focus on visuals (diagrams, flowcharts) to explain the concepts instead of large blocks of text.
- Explanation over Reading: Do not just read off the slides. Instead, use the slides as a guide to explain the concepts in your own words.
- Engage the Audience: Ensure your explanations are clear and structured in a way that the class can follow, especially since some topics will be technically challenging.
- Practice Timing: Your group should aim for a 10-12 minute presentation to allow time for class discussion afterward.
Leading the Discussion¶
- Prepare Questions: After your presentation, you will lead a brief discussion by asking the class the critical thinking questions you prepared.
- Encourage Participation: Make sure the discussion is interactive. Encourage classmates to share their thoughts and ideas about the methodology and any potential challenges they see.
Rubric¶
Your presentation will be evaluated on the following:
- Understanding of the Topic: How well does your group understand and explain the methodology behind AlphaFold 3?
- Clarity of Explanation: Are the concepts presented clearly and logically? Are diagrams or visuals used effectively?
- Critical Engagement: Does your group encourage the class to think critically about the method, its limitations, and its significance?
- Class Engagement: Are the discussion questions thought-provoking and do they lead to meaningful class participation?
Groups¶
Core Mechanisms in AlphaFold 3
1. Diffusion-Based Architecture in AF3¶
Begin by describing what a diffusion model is in general terms (this could be from research or other areas, like image denoising models). How do these models work by iterating and gradually improving an initial guess? Look into why AlphaFold 3 would adopt this approach. Specifically, how does a diffusion-based model help with protein structure prediction, especially compared to AF2’s approach?
- What does it mean that AF3 operates directly on atomic coordinates? How does this compare with AF2, which used more abstract representations (like backbone frames and side-chain torsion angles)?
- What are the advantages of predicting atomic coordinates directly rather than going through intermediate steps?
Research the basic concept of how a diffusion process works. How does AF3 take an initial guess of a protein’s structure and refine it through this iterative process?
- Think about what challenges arise during the refinement of protein structures. For example:
    - How might the initial structure be "noisy" or inaccurate?
    - How does the diffusion model gradually refine the structure? What’s being corrected step by step?
- Investigate how the diffusion process deals with stereochemical errors. What are these errors, and why are they problematic in protein structure prediction?
- Explore the idea of multiscale diffusion in AF3. What might it mean to refine both local details (such as bond angles) and global features (such as the overall protein fold) at the same time?
Research and compare the traditional methods of structure prediction (such as in AF2) and the diffusion model introduced in AF3. How does the diffusion model simplify or improve upon earlier approaches? Think critically about the trade-offs. Does this approach introduce any new challenges? For instance, are there any computational costs (time, resources, etc.) associated with using a diffusion-based model?
- How might a simpler prediction process (working directly on atomic coordinates) lead to more accurate results? Are there any downsides to cutting out the intermediate steps used in AF2?
- What are the potential computational trade-offs of using diffusion models? Do they require more processing power, time, or data?
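If the idea of "iterating and gradually improving an initial guess" feels abstract, a toy example may help. The sketch below is not AF3's architecture. It assumes four made-up points in place of atoms, and a hypothetical `toy_denoiser` that cheats by looking at the answer, where a real model would use a trained network. It only shows the loop structure: start from noise, denoise a little, add back a smaller amount of noise, and repeat.

```python
import math
import random

# Toy target: four points standing in for atom positions (made-up numbers).
TARGET = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0], [0.0, 1.5, 1.5]]


def rmsd(a, b):
    """Root-mean-square deviation between two equal-length lists of 3D points."""
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(a, b))
    return math.sqrt(squared / len(a))


def toy_denoiser(noisy, target):
    """Hypothetical stand-in for a trained network.

    It returns a guess halfway between the noisy input and the target.
    A real model never sees the answer; it learns to make this kind of
    guess from training data.
    """
    return [[0.5 * (n + t) for n, t in zip(p, q)] for p, q in zip(noisy, target)]


def refine(target, steps=20, seed=0):
    """Start from random coordinates and denoise them step by step."""
    rng = random.Random(seed)
    coords = [[rng.gauss(0.0, 10.0) for _ in range(3)] for _ in target]
    errors = [rmsd(coords, target)]
    for step in range(steps):
        # The noise added back shrinks linearly and is zero on the last step.
        sigma = 1.0 - (step + 1) / steps
        guess = toy_denoiser(coords, target)
        coords = [[x + rng.gauss(0.0, sigma) for x in point] for point in guess]
        errors.append(rmsd(coords, target))
    return coords, errors


final_coords, errors = refine(TARGET)
print(f"start error: {errors[0]:.2f}, end error: {errors[-1]:.2f}")
```

Notice that every step costs one call to the denoiser. That is one place to start when thinking about the computational trade-offs of diffusion models.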
2. Multiple Sequence Alignment (MSA) Processing in AF3¶
Begin by researching what MSAs are and why they’re important in protein structure prediction. How do they represent evolutionary relationships between protein sequences? Compare how AlphaFold 2 (AF2) relied heavily on MSAs to extract evolutionary information. Then, explore the shift in AlphaFold 3 (AF3), where there’s a reduced emphasis on MSAs. Why might this change have been introduced?
- What is the core function of an MSA in guiding protein structure predictions? Why did AF2 rely so much on MSA depth (the number of sequences and evolutionary distance between them)?
- Investigate the technical change in AF3—how does AF3 still maintain accuracy while relying less on deep MSAs? Look into how the pairformer module replaces the need for MSA depth in some cases.
Research the reasoning behind AF3’s shift away from MSAs. Why was this shift necessary or beneficial? Think about the challenges that arose in AF2 when high-quality evolutionary data was not available. Consider how AF3 compensates for this reduced dependence while still maintaining accurate predictions. What other mechanisms or methods (like the pairformer) does AF3 rely on to predict structure accurately without needing as much evolutionary information?
- Look into the drawbacks of relying on MSAs: What happens when evolutionary data is poor or absent (e.g., for novel or rapidly evolving proteins)?
- Explore the pairformer module in AF3. How does this module allow the model to maintain prediction accuracy without needing a deep evolutionary signal from MSAs?
Reflect on the importance of evolutionary relationships in structure prediction. How does evolutionary conservation help predict protein structure? What insights does MSA data provide, and why has it been a cornerstone of previous methods like AF2? Then, think critically about AF3’s innovation: In cases where evolutionary information is weak or missing, how does AF3 continue to make accurate predictions? What alternatives or compensatory strategies does AF3 employ to make up for missing evolutionary data?
- Investigate how AF3 processes proteins with poor evolutionary information. Are there cases where this new approach might fail or perform less effectively? How does the model maintain accuracy without deep evolutionary data?
- Class Discussion Question: "If evolutionary data is limited or absent, should we trust the structure predictions of AF3? How might this limitation affect its performance in predicting novel proteins?"
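To see what kind of signal an MSA carries, it can help to compute something simple from one. The sketch below assumes a tiny made-up alignment (`toy_msa`) and measures per-column conservation with Shannon entropy. It is one common, simple summary, not how AF2 or AF3 process MSAs. Low entropy means a column is conserved across sequences.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column.

    Gap characters ("-") are ignored. 0.0 means fully conserved.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def msa_entropies(msa):
    """Per-column entropy for a list of equal-length aligned sequences."""
    return [column_entropy(column) for column in zip(*msa)]


# Made-up alignment: 4 sequences (the "depth") by 5 columns.
toy_msa = [
    "MKVLA",
    "MKILA",
    "MRVLG",
    "MKV-A",
]

for position, entropy in enumerate(msa_entropies(toy_msa)):
    print(position, round(entropy, 3))
```

With only four sequences, these estimates are rough. That is a small-scale version of the problem the questions above raise: a shallow MSA gives a weak evolutionary signal.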
3. Pairformer Module vs. Evoformer¶
Begin by understanding the role of the evoformer module in AF2. This was a central component that processed multiple sequence alignments (MSAs) and pairwise features to generate structural predictions. Research how the evoformer used both sequence and pairwise data from MSAs to refine protein structure predictions. Then, explore the pairformer module in AF3, which replaces the evoformer. How does this new module simplify the processing of pairwise atomic relationships while de-emphasizing MSA data?
- What was the purpose of the evoformer in AF2? How did it integrate sequence and pairwise features?
- What changes does the pairformer introduce in AF3? How does it handle pairwise atomic representations differently from the evoformer?
- Look into how the reduction in MSA dependence has been balanced by the pairformer’s ability to work more directly with atomic interactions.
Research how the pairformer in AF3 improves computational efficiency compared to the evoformer. Focus on how AF3 simplifies the handling of complex atomic interactions by working more directly with pairwise relationships between atoms, instead of relying on deep evolutionary sequence data. Consider why this change was needed. What bottlenecks did the evoformer and its reliance on MSA depth create in AF2? How does the pairformer alleviate these issues?
- Investigate why the pairformer can process pairwise representations more efficiently than the evoformer. What specific techniques does it use to handle the relationships between atoms in a protein structure?
- Look into how AF3 handles complex biomolecular interactions (such as those in protein-ligand or protein-protein systems). How does the pairformer manage these interactions compared to the evoformer?
Reflect on the impact of reducing MSA processing complexity in AF3. The evoformer relied heavily on MSA features to predict structure in AF2, but the pairformer minimizes this dependency. Think about how this change improves efficiency but might also introduce challenges in cases where evolutionary data is limited. Ask: Does reducing the complexity of MSA processing come at the cost of prediction accuracy? Or has AF3 managed to balance the need for evolutionary information in other ways, like more precise atomic pair interactions?
- How does the pairformer’s reduced reliance on MSAs affect accuracy in cases where evolutionary information is rich versus when it’s sparse?
- What are the potential trade-offs between speed and accuracy when simplifying MSA processing? Does the pairformer compensate for the reduced MSA complexity in other ways?
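Both the evoformer and the pairformer update a pair representation using "triangle" operations, where the entry for a pair (i, j) is informed by a third element k. The sketch below is a heavily simplified toy of the "outgoing edges" version. It assumes scalar pair values and leaves out the learned projections, gating, and normalization that the real modules use, so treat it as an intuition aid only.

```python
def triangle_update_outgoing(z):
    """Toy 'outgoing edges' triangle update on an n x n matrix of scalars.

    Each pair (i, j) is updated from the pairs (i, k) and (j, k) for every
    third element k. Real models work on vector-valued pairs with learned
    weights; none of that is modeled here.
    """
    n = len(z)
    return [
        [z[i][j] + sum(z[i][k] * z[j][k] for k in range(n)) / n for j in range(n)]
        for i in range(n)
    ]


# Made-up pair matrix: element 1 is "related to" elements 0 and 2,
# but 0 and 2 have no direct relationship yet.
pairs = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]

updated = triangle_update_outgoing(pairs)
print(updated[0][2])  # no longer zero: 0 and 2 share the neighbor 1
```

The cost is worth noticing: for every pair (i, j) the update loops over all k, so the work grows with the cube of the number of elements.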
4. Template Search and Embedding in AF3¶
Start by understanding what templates are in the context of protein structure prediction. Templates are known structures (often from the Protein Data Bank, PDB) that serve as a reference when predicting the structure of an unknown protein. In AF2, template use was important when sufficient evolutionary data (from MSAs) was unavailable or unreliable. AF2 searched for structural templates similar to the target protein and used them to guide predictions. Investigate how AF3 modifies this process. How does it search for and embed templates, and how is this process improved over AF2? Look into any changes in template embedding and integration that make AF3 more efficient or accurate.
- Research how AF3 conducts template searches. What databases or resources does it pull from, and what specific features does it look for when identifying a useful template?
- Explore how AF3 embeds these templates into its predictions. What changes in template integration have occurred between AF2 and AF3?
Dive into the situations where AF3 relies on templates. When evolutionary data is strong, AF3 can use MSAs to make accurate predictions. But when this data is weak or incomplete, AF3 needs templates to fill the gap. Consider why template-based predictions are especially useful in some cases. For example, when predicting the structure of a protein with no strong homologs or evolutionary data, AF3 can use a structural template from a similar protein to guide the prediction process.
- Investigate the criteria AF3 uses to decide whether to rely on a template. Are there specific conditions or thresholds (e.g., a shallow or low-quality MSA) that trigger template use?
- Research how template-based information is incorporated into the model’s predictions. What specific role do these templates play in shaping the final structure?
Think about the limitations of relying on templates. While templates can help guide the structure of an unknown protein, they also come with risks. For instance, if the template isn’t a close match to the target protein, it could lead to incorrect predictions. Ask: How does AF3 mitigate these issues? Research any strategies or techniques AF3 employs to avoid the pitfalls of template over-reliance.
- Explore cases where templates might introduce biases into the prediction. For instance, if AF3 relies on a poor template, how might that affect the overall structure prediction?
- Investigate how AF3 balances template use with other forms of data, like evolutionary information or direct atomic predictions, to ensure that predictions aren’t overly dependent on templates.
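When you think about what "embedding a template" could mean, one common starting point is a matrix of binned inter-residue distances taken from the known structure. The sketch below assumes made-up coordinates and arbitrary bin edges. It is not AF3's actual template featurization, just an illustration of turning 3D coordinates into a pairwise feature.

```python
import math


def distance_matrix(coords):
    """All-against-all Euclidean distances between 3D points."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def bin_distances(distances, edges):
    """Replace each distance with the index of the first bin edge it fits under.

    Distances larger than every edge fall into a final overflow bin.
    """
    def bin_index(d):
        for index, edge in enumerate(edges):
            if d <= edge:
                return index
        return len(edges)

    return [[bin_index(d) for d in row] for row in distances]


# Made-up template: three residues spaced 3.8 angstroms apart on a line.
template_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
bin_edges = [4.0, 8.0, 12.0]  # arbitrary example edges, in angstroms

features = bin_distances(distance_matrix(template_coords), bin_edges)
for row in features:
    print(row)
```

A feature like this depends only on the template, not on the target sequence. That is why a poorly matched template can push the prediction in the wrong direction.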
Lecture 09
Field overview editing
Date: Oct 23, 2024
Today, we’ll focus on peer reviewing each other’s Field Overview sections, finalizing presentations, and then delivering them to the class. This session emphasizes the value of constructive feedback to improve both written and oral communication in a supportive setting.
Lecture 10
Paper 04 - Reading
Date: Oct 30, 2024
Today's paper: Buttenschoen, M., Morris, G. M., & Deane, C. M. (2024). PoseBusters: AI-based docking methods fail to generate physically valid poses or generalise to novel sequences. Chemical Science, 15(9), 3130-3139. DOI: 10.1039/D3SC04185A
This class period is dedicated to forming groups, reading and analyzing the PoseBusters paper, and beginning work on your group presentations. Each group will focus on one specific aspect of the study and prepare a structured presentation to be delivered in class next week, following the peer editing of your analysis drafts.
- Forming Groups
    - You will divide into five groups, each assigned a unique section of the PoseBusters study to analyze in depth.
    - Make sure each group has a balanced mix of strengths, with students who can contribute to various aspects of the presentation, including research, visual design, and oral delivery.
- Reading and Analyzing the Paper
    - Once groups are formed, read through the PoseBusters paper together, focusing on the section assigned to your group.
    - Discuss and take notes on the methodology, results, and key insights related to your group’s specific topic (e.g., docking protocols, validation criteria, performance comparison, energy minimization, or binding site considerations).
    - As you read, pay attention to any supplemental materials (SI) that might provide additional data or visuals to support your presentation.
- Planning and Drafting Your Presentation
    - Begin planning your presentation outline using the provided instructions for each group. Decide on key points, visuals, and examples to include for clarity.
    - Assign roles within the group, so each member knows which parts of the presentation they’ll focus on, from researching details to designing slides or delivering sections.
    - Remember that each presentation should be about 10-12 minutes long, with clear visuals and explanations, engaging the audience in a discussion of your section’s insights.
- Initial Slide Development and Content Organization
    - Start creating slides with a logical flow, using visuals to make complex points clear. Refer to charts, diagrams, or data in the SI as needed to enhance your explanations.
    - Keep slides concise and visual-based, avoiding text-heavy slides. Instead, focus on using bullet points, flowcharts, and diagrams to illustrate major points.
- Questions and Feedback
    - Use this time to ask questions and clarify any part of the paper or your group’s topic. I am here to help with any content, interpretation, or structure questions.
    - Review the presentation instructions to ensure your group’s approach aligns with expectations and covers all key points.
Groups¶
Group 1: Docking Protocols and Setup Across Methods¶
Your goal is to introduce the docking protocols used in the PoseBusters study for each method. This will set up a foundational understanding of how each docking method was configured and prepared for accurate comparisons. Emphasize the importance of standardized setups in docking studies and why a consistent initial setup is essential for obtaining meaningful results, especially when comparing different methodologies.
- Brief Overview of Docking Methods
    - Instructions: Begin with a concise summary of each docking method: DiffDock, DeepDock, EquiBind, TankBind, Uni-Mol, AutoDock Vina, and CCDC Gold.
    - Guidelines:
        - Provide a brief description of each method’s approach to docking (e.g., traditional scoring for AutoDock Vina and Gold vs. deep learning in DiffDock).
        - Emphasize unique aspects of each method to clarify why certain setups might differ across methods.
- Detailed Protocol and Setup
    - Instructions: Focus on the ligand and protein preparation steps used for each docking method.
    - Guidelines:
        - Describe ligand preparation, such as using RDKit’s ETKDGv3 conformer generation for ligand conformations.
        - Detail the protein preparation, including hydrogen addition and binding site specifications (e.g., radius or distance thresholds) tailored for each method.
        - Explain any deviations between methods and why these differences are necessary based on each algorithm’s requirements.
- Importance of Consistency in Setup for Comparative Analysis
    - Instructions: Emphasize why the PoseBusters study applied a standardized protocol for each method, regardless of whether it was classical or DL-based.
    - Guidelines:
        - Explain that standardization allows for fairer comparisons, reducing bias introduced by varying preparation steps.
        - Discuss the challenges in standardizing these protocols across fundamentally different methodologies (e.g., deep learning models vs. classical scoring-based models).
Wrap up your presentation by explaining that even with a consistent setup, assessing docking outcomes requires rigorous validation criteria to determine how well each method performs.
Group 2: PoseBusters Validation Criteria¶
Your goal is to explain the three core categories of validation criteria used by PoseBusters to evaluate docking predictions. These criteria are essential for ensuring the chemical and physical reliability of docking outputs across different methods. Focus on how these criteria—chemical validity, intramolecular validity, and intermolecular validity—collectively ensure predictions are meaningful and suitable for scientific analysis, ultimately contributing to the concept of PB-validity.
- Overview of the Three Validation Criteria Categories
    - Instructions: Introduce the three main categories used in PoseBusters to evaluate docking outcomes.
    - Guidelines:
        - Begin with a general introduction to validation in docking studies and why rigorous criteria are essential for comparing methods.
        - Briefly outline the three categories (chemical, intramolecular, and intermolecular validity) and note that each checks whether a prediction is chemically and physically plausible, which is a separate question from how close it is to the experimental pose.
    - Slide Tips: Create a clear, labeled diagram or flowchart that displays each validation category, perhaps with visual icons (e.g., chemical bonds for chemical validity, molecular structure for intramolecular validity).
- Detailed Breakdown of Each Category
    - Instructions: Dive deeper into each category, explaining what each type of validity checks and why it’s necessary.
    - Guidelines:
        - Chemical Validity: Cover checks like RDKit sanitization, molecular formula preservation, and stereochemistry (e.g., chirality and double bond configurations).
        - Intramolecular Validity: Discuss bond length and angle assessments, focusing on how these checks help prevent unrealistic molecular geometries.
        - Intermolecular Validity: Explain checks for steric clashes, including protein-ligand minimum distances and volume overlaps with other molecules like cofactors.
        - For each validity category, briefly mention the consequences if predictions fail these checks (e.g., chemically invalid ligands cannot function in biological systems).
    - Slide Tips: Use individual slides or sections for each category, accompanied by brief explanations and relevant visuals (e.g., stereochemistry diagrams for chemical validity, bond angles for intramolecular validity).
- The Concept of PB-Validity
    - Instructions: Introduce PB-validity (PoseBusters validity) as a cumulative measure of docking reliability.
    - Guidelines:
        - Explain how PB-validity represents an all-encompassing metric that indicates a pose passes all checks within the three categories, showing it is chemically, intramolecularly, and intermolecularly sound.
        - Emphasize the advantage of PB-validity in offering a holistic measure of docking outcome reliability, aiding in the comparison of docking methods.
    - Slide Tips: Create a visual representation or checklist showing how each criterion contributes to PB-validity, perhaps using a flowchart that ends in a “PB-validity” box for simplicity.
- Importance of These Validation Steps
    - Instructions: Conclude by stressing the importance of these criteria in validating docking methods and ensuring that results are meaningful, particularly when comparing classical and DL-based docking approaches.
    - Guidelines:
        - Briefly mention how these criteria help address unique challenges posed by deep learning models versus traditional docking approaches.
        - Reinforce that without these checks, docking predictions could result in unreliable poses that fail to reflect realistic protein-ligand interactions.
    - Slide Tips: Summarize this point on a final slide to reinforce that validation is a critical bridge from protocol to performance, setting up Group 3.
Conclude by emphasizing that validation alone doesn’t reveal which methods perform best; performance outcomes must still be compared to understand the strengths and weaknesses of each approach.
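The "passes all checks" logic behind PB-validity is easy to sketch. The example below is not the PoseBusters implementation. The two checks, the tolerance factors, the fixed 2.0 angstrom clash cutoff, and the coordinates are all simplified assumptions chosen for illustration. The point is only that one failed check makes the whole pose invalid.

```python
import math


def bond_lengths_ok(coords, bonds, low=0.75, high=1.25):
    """Pass if every bond length lies within [low, high] times its reference.

    bonds is a list of (i, j, reference_length) tuples. The tolerance
    factors are illustrative assumptions.
    """
    return all(
        low * ref <= math.dist(coords[i], coords[j]) <= high * ref
        for i, j, ref in bonds
    )


def no_clash(ligand, protein, min_distance=2.0):
    """Pass if no ligand atom is closer than min_distance to a protein atom.

    A single fixed cutoff in angstroms is a simplifying assumption.
    """
    return all(math.dist(a, b) >= min_distance for a in ligand for b in protein)


def pb_valid(checks):
    """A pose counts as valid only if every individual check passed."""
    return all(checks.values())


# Made-up two-atom ligand and one-atom "protein".
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
protein = [(0.0, 5.0, 0.0)]
checks = {
    "bond_lengths": bond_lengths_ok(ligand, [(0, 1, 1.5)]),
    "no_protein_clash": no_clash(ligand, protein),
}
print(checks, pb_valid(checks))
```

Keeping the individual results in a dictionary, rather than returning a single yes/no, makes it possible to report which check a pose failed.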
Group 3: Performance Comparison Between Classical and DL-Based Docking Methods¶
Your group’s goal is to compare the docking performance of classical methods (like AutoDock Vina and CCDC Gold) with deep learning (DL)-based methods (such as DiffDock, DeepDock, and EquiBind) using PoseBusters’ RMSD and PB-validity criteria. Highlight the differences in accuracy and reliability between these two types of methods, focusing on how each performs under the validation checks established in PoseBusters.
- Introduction to Performance Metrics (RMSD and PB-Validity)
    - Instructions: Start by briefly introducing RMSD (Root Mean Square Deviation) and PB-validity as the main metrics used to assess docking performance in PoseBusters.
    - Guidelines:
        - Explain RMSD as a measure of the positional accuracy of the predicted ligand pose compared to the experimental (or known) pose, with lower values indicating greater accuracy.
        - Introduce PB-validity as a comprehensive measure that indicates a ligand pose meets all physical, chemical, and geometric validity checks.
    - Slide Tips: Create a clear introductory slide with concise definitions or icons for RMSD and PB-validity. If helpful, add a small diagram showing the concept of RMSD to aid understanding.
- Comparative Analysis of RMSD and PB-Validity Across Methods
    - Instructions: Present a comparative analysis of RMSD and PB-validity scores across classical and DL-based methods.
    - Guidelines:
        - Highlight how classical methods (AutoDock Vina and CCDC Gold) and DL-based methods (e.g., DiffDock, EquiBind) performed according to RMSD and PB-validity.
        - Describe trends: for example, classical methods might have lower RMSD and higher PB-validity, while DL-based methods may have challenges in maintaining validity checks.
    - Slide Tips: Use a comparison table, bar chart, or side-by-side visual (such as a waterfall plot from the SI) to display RMSD averages and PB-validity percentages for each method, emphasizing key differences.
- Trends in Valid Pose Counts and Accuracy Differences
    - Instructions: Summarize broader trends in the number of valid poses and accuracy differences between classical and DL-based docking methods.
    - Guidelines:
        - Describe how classical methods generally produced more valid poses, meeting validation checks for geometry and physical feasibility, while DL-based methods struggled in some areas.
        - Mention any specific DL methods that performed well or poorly on particular aspects, like bond angle accuracy or stereochemistry.
    - Slide Tips: Consider a waterfall plot from the SI that shows valid and invalid pose counts, or create a line graph to show trends in valid pose counts across methods. Label peaks and dips to indicate which methods succeeded or failed in meeting PB-validity.
- Strengths and Weaknesses of DL-Based vs. Classical Docking Approaches
    - Instructions: Conclude your performance comparison by discussing the strengths and limitations of DL-based and classical methods as observed in PoseBusters.
    - Guidelines:
        - Classical Methods: Highlight their robustness in producing PB-valid poses but note they may require more computational resources and time compared to DL-based methods.
        - DL-Based Methods: Explain that while DL methods like DiffDock and EquiBind are fast and scalable, they often fall short in chemical and stereochemical accuracy, particularly without extensive validation or adjustments.
        - Use specific examples from PoseBusters results to illustrate each point.
    - Slide Tips: Create a summary table or chart contrasting DL-based and classical methods’ strengths and weaknesses. Include specific examples from the data to make this comparison more tangible.
Wrap up by noting that while the results highlight performance trends, energy minimization can further impact docking predictions by refining pose accuracy and addressing validity issues.
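If you want to explain RMSD on a slide, it may help to see how little code it takes. The sketch below assumes the predicted and reference ligand atoms are already listed in the same order, and it does no symmetry correction, which real docking evaluations typically need. The 2 angstrom cutoff is the conventional success threshold, and the paper reports it alongside PB-validity. The pose data are made up.

```python
import math


def ligand_rmsd(predicted, reference):
    """RMSD in angstroms between matched atoms, with no realignment of the poses."""
    if len(predicted) != len(reference):
        raise ValueError("poses must contain the same number of atoms")
    squared = sum(math.dist(p, r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(reference))


def success_rates(poses, cutoff=2.0):
    """Return (fraction accurate, fraction accurate and PB-valid).

    poses is a list of (rmsd, is_pb_valid) pairs, one per complex.
    """
    total = len(poses)
    accurate = sum(1 for rmsd, _ in poses if rmsd <= cutoff)
    accurate_and_valid = sum(1 for rmsd, valid in poses if rmsd <= cutoff and valid)
    return accurate / total, accurate_and_valid / total


# Made-up example: the predicted pose is the reference shifted 1 angstrom along x.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
predicted = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0)]
print(ligand_rmsd(predicted, reference))

# Made-up results for four complexes from one hypothetical method.
results = [(1.0, True), (1.5, False), (3.0, True), (0.5, True)]
print(success_rates(results))
```

The gap between the two fractions is the quantity to watch when comparing methods: a pose can land close to the experimental one and still fail the validity checks.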
Group 4: Effects of Energy Minimization on Docked Poses¶
Your objective is to analyze how post-docking energy minimization affects docking validity and prediction accuracy. This includes both the positive impacts (refining docked poses) and potential drawbacks (introducing inaccuracies). By examining specific examples from the PoseBusters study, you’ll show that energy minimization is a double-edged sword, capable of both improving and compromising docking results.
- Introduction to Energy Minimization in Docking
- Instructions: Begin with a brief explanation of what energy minimization is and why it’s used in docking studies.
- Guidelines:
- Define energy minimization as a process that adjusts molecular geometries to reach a stable, low-energy conformation, typically using force fields like AMBER or Sage.
- Explain that energy minimization is intended to refine predicted ligand poses by resolving unrealistic geometries, bond lengths, and angles, thus increasing docking accuracy.
- Slide Tips: Use a simple diagram showing a ligand pose before and after energy minimization, highlighting changes in geometry or energy levels to introduce this concept.
- Role of Force Fields (AMBER, Sage) in Refining Docked Poses
- Instructions: Provide an overview of how force fields like AMBER and Sage contribute to energy minimization.
- Guidelines:
- Explain that these force fields model physical interactions (e.g., bond stretching, angle bending) to produce a realistic conformation by reducing strain.
- Mention any particular challenges associated with using force fields in energy minimization, such as computational expense or dependence on accurate parameterization.
- Discuss why these force fields are specifically used for biological molecules to achieve physically realistic docking poses.
- Slide Tips: Consider using a labeled diagram showing the types of molecular interactions (e.g., bond lengths, angles) that force fields optimize.
- Examples of Energy Minimization Effects: Refinement vs. Distortion
- Instructions: Present examples from the PoseBusters study showing both positive and negative impacts of energy minimization on docking results.
- Guidelines:
- Show examples where energy minimization successfully corrected geometric errors in DL-based poses, such as unrealistic bond angles or steric clashes.
- Highlight cases where energy minimization “destroyed” previously valid poses, introducing distortions that moved them away from realistic conformations.
- Emphasize that while energy minimization can correct certain errors, it’s not a guaranteed solution and can sometimes compromise docking accuracy.
- Slide Tips: Use before-and-after visuals from the PoseBusters study, such as specific ligand conformations pre- and post-minimization, to illustrate changes.
- Visual Data Analysis: Mixed Results of Energy Minimization
- Instructions: Use visual data to illustrate the mixed outcomes of energy minimization on docking validity.
- Guidelines:
- Present data from the PoseBusters SI, such as percentage of poses passing validation checks after minimization, to show the variability in results.
- Use waterfall or bar plots to highlight trends (e.g., increased validation for some methods, decreased for others), underscoring the inconsistency of energy minimization’s effects.
- Slide Tips: Include charts or waterfall plots from the SI showing performance metrics before and after energy minimization, emphasizing changes in PB-validity.
- Complexity of Energy Minimization in Docking
- Instructions: Conclude by discussing why energy minimization is a complex and nuanced process that can both help and hinder docking outcomes.
- Guidelines:
- Mention how energy minimization may cause oversimplification of molecular interactions, sometimes ignoring critical details needed for accurate docking.
- Discuss the challenge of balancing energy minimization’s corrective power with its potential to introduce artifacts or over-corrections, particularly in deep learning-based docking.
- Reinforce that while energy minimization can improve docking validity, it’s not always appropriate or effective across all docking methods.
- Slide Tips: Create a summary slide with bullet points listing both the benefits and drawbacks of energy minimization, giving the audience a balanced view.
Conclude by mentioning that energy minimization’s effectiveness can vary further in cases involving cofactors or complex binding sites, which present additional challenges to docking accuracy.
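To make the definition of energy minimization in this outline more concrete, here is a minimal Python sketch of steepest-descent minimization on a toy energy function. It is not the AMBER or Sage force field and not the protocol used in the PoseBusters study. The force constant `K`, the equilibrium length `R0`, the three-atom connectivity, and the starting coordinates are all hypothetical, and only harmonic bond-stretching terms are modeled.

```python
import math

K = 300.0   # hypothetical bond force constant, kcal/(mol*Å^2)
R0 = 1.5    # hypothetical equilibrium bond length, Å
BONDS = [(0, 1), (1, 2)]  # hypothetical connectivity of a three-atom fragment

def energy_and_gradient(coords):
    """Return E = sum over bonds of 0.5*K*(r - R0)^2 and dE/dx for every atom."""
    energy = 0.0
    grad = [[0.0, 0.0, 0.0] for _ in coords]
    for i, j in BONDS:
        delta = [a - b for a, b in zip(coords[i], coords[j])]
        r = math.sqrt(sum(d * d for d in delta))
        energy += 0.5 * K * (r - R0) ** 2
        for axis in range(3):
            g = K * (r - R0) * delta[axis] / r
            grad[i][axis] += g
            grad[j][axis] -= g
    return energy, grad

def minimize(coords, step=1e-3, max_iter=10000, tol=1e-6):
    """Steepest descent: move every atom a small step against the gradient."""
    coords = [list(c) for c in coords]
    for _ in range(max_iter):
        _, grad = energy_and_gradient(coords)
        # Stop once the largest gradient component is negligible.
        if max(abs(g) for atom in grad for g in atom) < tol:
            break
        for atom, atom_grad in zip(coords, grad):
            for axis in range(3):
                atom[axis] -= step * atom_grad[axis]
    return coords

# Strained starting pose: one bond stretched to 1.8 Å, one compressed to 1.2 Å.
start = [(0.0, 0.0, 0.0), (1.8, 0.0, 0.0), (1.8, 1.2, 0.0)]
relaxed = minimize(start)

print("energy before:", energy_and_gradient(start)[0])
print("energy after: ", energy_and_gradient(relaxed)[0])
print("bond lengths: ", [round(math.dist(relaxed[i], relaxed[j]), 3) for i, j in BONDS])
```

The minimizer only follows the energy function downhill. If that function is incomplete or poorly parameterized for a given ligand, the "relaxed" geometry can move away from the experimentally observed pose, which is the double-edged behavior this group is asked to discuss.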
Lecture 11
Analysis editing
Date: Nov 6, 2024
Today, we’ll focus on peer reviewing each other’s Analysis sections, finalizing presentations, and then delivering them to the class. This session emphasizes the value of constructive feedback to improve both written and oral communication in a supportive setting.
Lecture 12
Writing Workshop
Date: Nov 13, 2024
Today's workshop is designed to help you make significant progress on your perspective paper. You'll have access to focused support, resources, and peer feedback to strengthen your draft.
Draft Revision Planning Guide¶
Introduction Assessment¶
What specific areas need work?
- Opening grabs attention without being too broad
- Background funnel moves logically from broad to specific
- Clearly identifies specific gaps/challenges in the field
- States your perspective explicitly (not just describing the topic)
- Provides clear roadmap of paper without just listing sections
Field Overview Assessment¶
What specific areas need work?
- Uses assertion statements as headings (not just topic labels)
- Synthesizes literature (not just summarizing papers one by one)
- Shows clear connections between different studies
- Identifies patterns or trends across the literature
- Maintains focus on points relevant to your perspective
- Properly distinguishes established facts from interpretations
Analysis Assessment¶
What specific areas need work?
- Arguments clearly support your stated perspective
- Each paragraph makes one clear point
- Claims are supported by specific evidence
- Counterarguments are acknowledged and addressed
- Technical terms are explained appropriately
- Figures/tables are integrated effectively into the text
Future Directions Assessment¶
What specific areas need work?
- Suggests specific, feasible next steps
- Connects clearly to your perspective
- Identifies concrete research questions
- Discusses potential methodological approaches
- Avoids overly general statements
Citation Check¶
List citations needing review.
- All facts have appropriate citations
- Citations appear before periods
- Reference list is complete and in APA format
- Citations are from peer-reviewed sources
- No overcitation of basic concepts
Specific Questions for Instructor¶
List specific questions about your draft for your instructor.
Areas for Peer Review¶
What specific aspects of your paper do you want peers to examine?
Writing Workshop¶
During this time, you'll have access to four support stations. You can move between stations based on your needs:
Station 1: Core Content Development¶
Get help with:
- Refining your introduction's background funnel
- Strengthening your field overview
- Creating effective assertion-based headings
- Synthesizing literature effectively
Station 2: Analysis & Future Directions¶
Receive guidance on:
- Developing your perspective arguments
- Writing the new Future Directions section
- Crafting an impactful conclusion
- Connecting your ideas to broader implications
Station 3: Technical Support¶
Get assistance with:
- APA citation formatting
- Figure and table placement
- Page layout requirements
- Writing style refinement
Station 4: Peer Review Corner¶
Work with classmates to:
- Exchange focused feedback
- Test the clarity of your arguments
- Improve flow between sections
- Strengthen evidence presentation
Progress Review¶
We'll wrap up by:
- Sharing progress made during the workshop
- Identifying next steps for completion
- Creating specific action plans for final revisions
- Addressing any remaining questions
Key Elements to Address Today¶
Consider these points as you work:
✓ Is your perspective clear and well-supported?
✓ Does your evidence effectively support your arguments?
✓ Are your Future Directions logical and well-connected to your analysis?
✓ Do your sections flow smoothly with clear transitions?
✓ Is your writing clear, concise, and academically appropriate?
✓ Are all technical elements (citations, formatting) correct?
Lecture 13
Peer Review Workshop
Date: Nov 20, 2024
This workshop is designed to help you get targeted feedback on your perspective paper and create a concrete plan for final revisions. If you are not planning to submit a final draft, you may work quietly on other coursework.
Peer Review Session¶
Each paper will be reviewed using the following process:
- Silent Reading
- Reviewers read the paper
- Use provided feedback form to track observations
- Mark specific passages that need attention
- Written Feedback
- Complete the structured feedback form, which covers:
- Argument Clarity: Is the main perspective clearly stated? Where? Are the arguments well-developed and supported? Note any points that need clarification.
- Evidence Support: Are claims backed by appropriate citations? Is evidence used effectively? Identify any assertions needing support.
- Logical Flow: Does the paper progress logically? Are transitions effective? Mark any disconnected sections.
- Technical Elements: Are technical terms properly explained? Are figures/tables effectively used? Note any formatting issues.
- Writing Style: Is the tone appropriately academic? Are sentences clear and varied? Mark any awkward phrasings.
- Verbal Discussion
- Each reviewer shares one major strength, two specific suggestions for improvement, and one question for the author.
Feedback Exchange Guidelines¶
- Be specific and constructive
- Provide examples where possible
- Focus on substantive improvements
- Reference rubric criteria
- Ask clarifying questions
Revision Planning¶
Organizing Feedback¶
Create a revision matrix:
| Section | Feedback Received | Action Items | Priority | Resources Needed |
|---|---|---|---|---|
| Intro | | | | |
| Field Overview | | | | |
| Analysis | | | | |
| Future Directions | | | | |
| Conclusion | | | | |
Individual Planning¶
- Categorize Feedback
- Must address (affects core argument)
- Should address (improves clarity/flow)
- Could address (minor improvements)
- Create Timeline
- List specific revision tasks
- Estimate time needed for each
- Set internal deadlines
- Plan review/editing time
- Identify Resources
- Additional sources needed
- Writing center appointments
- Office hours visits
- Style guides
Lecture 14
Semester Wrap-Up
Date: Dec 4, 2024
As we gather for our final session, we'll create a supportive space to reflect on our semester-long journey through computational biology and look ahead to future paths. This session is designed to be informal and conversational, allowing everyone to share their experiences and thoughts about the road ahead.
Research Reflections¶
We'll begin by forming small discussion circles of 4-5 students. This setting allows for meaningful conversation about your research experiences. Each person will have about three minutes to share their journey - think of this as a casual conversation rather than a presentation. You might discuss what fascinated you most about your topic, moments that challenged your assumptions, or how your understanding evolved throughout the semester.
Consider sharing:
- An unexpected discovery that shifted your perspective
- A challenging moment that led to growth
- How your understanding of computational biology has evolved
- What you're most proud of accomplishing
Remember, this is a safe space to discuss both successes and struggles. There's no pressure to present formally - we're here to learn from each other's experiences.
Looking to the Future¶
In the second part of our session, we'll transition to discussing future aspirations and concerns. We'll break into groups based on shared interests, whether that's graduate school, industry positions, research careers, or still exploring options. This is an opportunity to voice both your excitement and anxieties about the path ahead.
Share openly about:
- What excites you about your chosen path
- What keeps you up at night
- Uncertainties you're grappling with
- Support you're seeking
- Resources you've found helpful
For those considering graduate school:
- What programs interest you?
- What aspects of advanced study worry you?
- How do you feel about the application process?
- What support would be most helpful?
- What questions drive your curiosity?
- What aspects of research feel daunting?
For those eyeing industry:
- Which sectors intrigue you?
- What skills make you nervous?
- How do you feel about job searching?
- What networking opportunities interest you?
For those still exploring:
- What options are you considering?
- What factors influence your decision?
- What information would help you decide?
- How can we support your exploration?
Final Thoughts¶
Before we part, we'll take time to acknowledge everyone's accomplishments and share final words of encouragement. This is also a chance to complete course evaluations and ask any lingering questions about your future paths.
Remember, this session is about supporting each other as we look to the future. Share honestly, listen empathetically, and celebrate how far you've come. Your anxieties and uncertainties are normal and shared by many of your peers - voicing them often helps us find common ground and mutual support.
Assessments¶
TODO:
Pre-class assignments¶
L03 pre-class assignment
Please read the selected paper below and answer the following questions in your own words. I am only looking for a few sentences per question. Write your answers in your favorite word processing software and submit them as a PDF to Gradescope.
Paper: Champion, C., Gall, R., Ries, B., Rieder, S. R., Barros, E. P., & Riniker, S. (2023). Accelerating Alchemical Free Energy Prediction Using a Multistate Method: Application to Multiple Kinases. Journal of Chemical Information and Modeling, 63(22), 7133-7147. DOI: 10.1021/acs.jcim.3c01469
Tip
Do not get too bogged down by the methodology. Your instructor will go over the following topics in the respective lecture.
- Molecular Dynamics (MD) Simulations:
- Basic principle: computer simulation of physical movements of atoms and molecules
- Applications in computational biology and drug discovery
- Force Fields:
- Definition: set of parameters used to calculate the potential energy of a system of atoms
- Examples mentioned in the paper: GAFF and OpenFF
- Free Energy Calculations:
- Importance in drug discovery (predicting binding affinities)
- Brief overview of traditional methods like Free Energy Perturbation (FEP) and Thermodynamic Integration (TI)
- Alchemical Transformations:
- Concept of "morphing" one molecule into another in silico
- Why it's useful for calculating relative binding free energies
- Sampling in MD Simulations:
- Importance of exploring conformational space
- Challenges with traditional methods (e.g., getting "stuck" in local energy minima)
- Enhanced Sampling Techniques:
- General concept of improving exploration of conformational space
- Brief mention of replica exchange as an example
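As a preview of the first two topics, the sketch below shows the basic loop behind an MD simulation: a force field supplies forces, and an integrator advances positions and velocities in small time steps. It is a one-dimensional toy in reduced units, not the setup used in the paper. The harmonic "force field" parameters `k` and `x0`, the mass, the time step, and all function names are hypothetical choices for illustration, and velocity Verlet is used here simply because it is a widely used integrator.

```python
def force(x, k=1.0, x0=0.0):
    """Force from a harmonic potential U(x) = 0.5*k*(x - x0)^2; k and x0 are the parameters."""
    return -k * (x - x0)

def total_energy(x, v, mass=1.0, k=1.0, x0=0.0):
    """Kinetic plus potential energy of the single particle."""
    return 0.5 * mass * v * v + 0.5 * k * (x - x0) ** 2

def velocity_verlet(x, v, mass=1.0, dt=0.01, n_steps=1000):
    """Propagate position x and velocity v with the velocity Verlet integrator."""
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # forces at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
        trajectory.append((x, v))
    return trajectory

# Start displaced from equilibrium with zero velocity.
traj = velocity_verlet(x=1.0, v=0.0)
e_start = total_energy(*traj[0])
e_end = total_energy(*traj[-1])
print(f"energy at start: {e_start:.6f}, energy at end: {e_end:.6f}")
```

Real simulations apply the same update to thousands of atoms in three dimensions, with the forces coming from a full force field such as those named in the paper.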
Q01¶
What is the main goal or purpose of this study?
Q02¶
The authors applied replica-exchange enveloping distribution sampling (RE-EDS) to four different kinase systems. List these kinases and briefly describe the types of ligand modifications studied for each.
Q03¶
What two small molecule force fields were used in this study? Why do you think the authors chose to use multiple force fields?
Q04¶
Briefly explain the concept of "hybrid topology" used in this paper. How does it differ from "dual topology"?
Q05¶
What metric did the authors use to assess the accuracy of their binding free energy calculations? What value is considered the threshold for "chemical accuracy"?
Q06¶
For which kinase system did RE-EDS perform best? For which did it perform worst? What factors may have contributed to these differences?
Q07¶
What were some of the key limitations or challenges identified for the RE-EDS method?
Q08¶
Based on the results, what are the potential applications or benefits of using RE-EDS for drug discovery?
Q09¶
What questions do you have about the paper?
L05 pre-class assignment¶
Please skim the selected paper below and answer the following questions in your own words. I am only looking for a few sentences per question. Write your answers in your favorite word processing software and submit them as a PDF to Canvas.
Paper: Zhu, W., Zhang, Y., Zhao, D., Xu, J., & Wang, L. (2023). HiGNN: A Hierarchical Informative Graph Neural Network for Molecular Property Prediction Equipped with Feature-Wise Attention. Journal of Chemical Information and Modeling, 63(1), 43-55. DOI: 10.1021/acs.jcim.2c01099
Q01¶
What is the main goal or purpose of this study?
Q02¶
Explain what a Graph Neural Network is and why it's useful for molecular property prediction.
Q03¶
The authors proposed a new model called HiGNN. What does HiGNN stand for, and what are its two main components?
Q04¶
What metric(s) did the authors use to assess the accuracy of their molecular property predictions? How did HiGNN perform compared to other models?
Q05¶
The study evaluated HiGNN on several benchmark datasets. Name three of these datasets and briefly describe what properties they represent.
Q06¶
What questions do you have about the paper?
L09 pre-class assignment¶
Paper: TODO:
L11 pre-class assignment¶
Paper: TODO:
Perspective paper¶
In this course, you will write a perspective article on a critical debate in computational biology. This assignment aims to develop your ability to analyze complex scientific issues, form well-reasoned arguments, and articulate your viewpoint supported by current research. You will:
- Choose one of the provided perspective primers, each presenting a nuanced question in computational biology.
- Write a detailed, well-supported perspective on your chosen topic (5-7 pages, excluding cover page, references, tables, and figures).
Objective¶
Your task is to research your chosen topic thoroughly, present a balanced argument considering multiple viewpoints, and provide a well-reasoned perspective supported by current scientific evidence. This paper should demonstrate your critical thinking skills and ability to synthesize complex information in computational biology.
Key Components¶
While you have flexibility in structuring your paper, consider including these elements:
- Introduction: Set the context, highlight the topic's significance, and present your main argument.
- Background: Provide necessary scientific context and key concepts.
- Current State of Research: Summarize relevant findings and identify trends or conflicts in the literature.
- Your Perspective: Present your viewpoint, supported by evidence from your research.
- Implications and Future Directions: Discuss the potential impact of your perspective and suggest areas for further research.
- Conclusion: Synthesize your main points and reinforce your stance.
Remember to maintain a scholarly tone while engaging your audience of peers and instructors who have foundational knowledge of computational biology.
Evaluation Criteria¶
Your paper will be evaluated based on:
- Depth of research and understanding of the topic
- Quality and relevance of supporting evidence
- Clarity and coherence of your argument
- Critical analysis and original thinking
- Adherence to academic writing standards and provided guidelines
We encourage you to approach this assignment as an opportunity to contribute meaningfully to ongoing debates in computational biology. Good luck with your research and writing!
Perspective primers¶
You must choose one of the following perspective primers and write a detailed, well-supported perspective on the topic. Each primer presents a nuanced question with no clear right or wrong answer, encouraging you to explore the literature, form your own opinion, and justify your stance. Your task is to research your chosen topic thoroughly, present a balanced argument, and provide a well-reasoned perspective supported by current scientific evidence.
Protein structure prediction¶
Primer: Are ab initio protein structure prediction algorithms still relevant in the deep learning era?
Ab initio protein structure prediction algorithms can determine the three-dimensional structures of proteins from their amino acid sequences without relying on homologous structures. These methods often involve intensive computational processes and can be time-consuming. However, recent advances in deep learning, exemplified by tools like AlphaFold, have dramatically improved the accuracy and efficiency of protein structure predictions, challenging the relevance of traditional ab initio approaches.
In the era of deep learning, especially with sophisticated models that leverage vast amounts of data and computational power, protein structure prediction has seen unprecedented advancements. The question arises: do ab initio methods still hold value, or have they been rendered obsolete by these newer, data-driven approaches? This perspective should touch on the balance between traditional algorithmic approaches and cutting-edge machine learning techniques and their implications for the future of computational structural biology.
Possible discussion points:
- Accuracy and Reliability: Compare the accuracy and reliability of ab initio methods with deep learning-based predictions. Evaluate situations where one method may outperform the other.
- Computational Resources: Assess the computational demands of ab initio methods versus deep learning models, considering accessibility for different research institutions.
- Data Dependence: Discuss the dependence of deep learning models on large datasets and the potential limitations this may impose compared to ab initio methods, which do not exclusively rely on prior data.
- Innovation and Integration: Explore how traditional ab initio methods can be integrated with deep learning approaches to enhance prediction accuracy and reliability.
- Case Studies: Examine specific case studies where ab initio methods have provided unique insights or deep learning models have significantly outperformed traditional approaches.
- Future Prospects: Consider the future of protein structure prediction, including potential advancements in ab initio and deep learning methods and their implications for the field.
One can argue for the continued relevance of ab initio methods based on their foundational principles, independence from large training datasets, and potential for integration with new technologies. Conversely, others may emphasize deep learning's transformative impact, highlighting its superior accuracy, efficiency, and the paradigm shift in the field.
Example papers
Here are some scientific articles to help get you started.
Primary
- Abramson, J., Adler, J., Dunger, J., Evans, R., Green, T., Pritzel, A., ... & Jumper, J. M. (2024). Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature, 630(8016), 493-500. DOI: 10.1038/s41586-024-07487-w
- Baek, M., DiMaio, F., Anishchenko, I., Dauparas, J., Ovchinnikov, S., Lee, G. R., ... & Baker, D. (2021). Accurate prediction of protein structures and interactions using a three-track neural network. Science, 373(6557), 871-876. DOI: 10.1126/science.abj8754
- Zhou, X., Zheng, W., Li, Y., Pearce, R., Zhang, C., Bell, E. W., ... & Zhang, Y. (2022). I-TASSER-MTD: a deep-learning-based platform for multi-domain protein structure and function prediction. Nature Protocols, 17(10), 2326-2353. DOI: 10.1038/s41596-022-00728-0
Opinion
- Outeiral, C., Nissley, D. A., & Deane, C. M. (2022). Current structure predictors are not learning the physics of protein folding. Bioinformatics, 38(7), 1881-1887. DOI: 10.1093/bioinformatics/btab881
- Kuhlman, B., & Bradley, P. (2019). Advances in protein structure prediction and design. Nature Reviews Molecular Cell Biology, 20(11), 681-697. DOI: 10.1038/s41580-019-0163-x
- Doga, H., Raubenolt, B., Cumbo, F., Joshi, J., DiFilippo, F. P., Qin, J., ... & Shehab, O. (2024). A perspective on protein structure prediction using quantum computers. Journal of Chemical Theory and Computation, 20(9), 3359-3378. DOI: 10.1021/acs.jctc.4c00067
Reviews
- Huang, B., Kong, L., Wang, C., Ju, F., Zhang, Q., Zhu, J., ... & Bu, D. (2023). Protein structure prediction: challenges, advances, and the shift of research paradigms. Genomics, Proteomics & Bioinformatics, 21(5), 913-925. DOI: 10.1016/j.gpb.2022.11.014
- Bertoline, L. M., Lima, A. N., Krieger, J. E., & Teixeira, S. K. (2023). Before and after AlphaFold2: An overview of protein structure prediction. Frontiers in Bioinformatics, 3, 1120370. DOI: 10.3389/fbinf.2023.1120370
Computer-aided drug design¶
Primer: Are molecular dynamics simulations overhyped in drug discovery, or do they provide indispensable insights?
Molecular dynamics (MD) simulations allow researchers to observe the behavior of molecules over time, offering detailed insights into the dynamic nature of protein-ligand interactions. This technique is often used after initial docking studies to refine and validate the predicted interactions. However, MD simulations are computationally intensive and require significant expertise to interpret.
Once docking has produced candidate poses, the question arises: should researchers invest in computationally expensive and time-consuming MD simulations, or proceed directly to wet-lab experiments, which might provide more definitive answers? This decision point is critical because it affects the efficiency, accuracy, and cost of the drug development process.
Possible discussion points:
- Accuracy and Precision: Debate the accuracy of MD simulations in predicting real-world molecular interactions compared to static docking models.
- Computational Resources: Consider the computational costs and accessibility of MD simulations for different research institutions.
- Predictive Value: Evaluate how MD simulations can refine docking results and their impact on predicting binding affinities and interaction stability.
- Experimental Validation: Discuss whether the insights gained from MD simulations justify the delay and resources compared to proceeding directly to wet lab experiments after docking.
- Case Studies: Examine specific case studies in which MD simulations have either provided critical insights or been unnecessary in the drug design process.
- Future Prospects: Discuss potential advancements in MD technology and their implications for future drug design, considering both the benefits and limitations.
One can argue that MD simulations are indispensable because they provide detailed dynamic insights and refine docking predictions, enhancing the reliability of subsequent wet-lab experiments. Conversely, others might highlight practical challenges, such as the computational expense and potential delays in the drug development timeline, and advocate for a more streamlined approach that moves directly from docking to experimental validation.
Example papers
Here are some scientific articles to help get you started.
Primary
- Alibay, I., Magarkar, A., Seeliger, D., & Biggin, P. C. (2022). Evaluating the use of absolute binding free energy in the fragment optimisation process. Communications Chemistry, 5(1), 105. DOI: 10.1038/s42004-022-00721-4
- Eberhardt, J., Santos-Martins, D., Tillack, A. F., & Forli, S. (2021). AutoDock Vina 1.2.0: New docking methods, expanded force field, and Python bindings. Journal of Chemical Information and Modeling, 61(8), 3891-3898. DOI: 10.1021/acs.jcim.1c00203
- Wan, S., Sinclair, R. C., & Coveney, P. V. (2021). Uncertainty quantification in classical molecular dynamics. Philosophical Transactions of the Royal Society A, 379(2197), 20200082. DOI: 10.1098/rsta.2020.0082
- Sahakyan, H. (2021). Improving virtual screening results with MM/GBSA and MM/PBSA rescoring. Journal of Computer-Aided Molecular Design, 35(6), 731-736. DOI: 10.1007/s10822-021-00389-3
- Lee, T. S., Lin, Z., Allen, B. K., Lin, C., Radak, B. K., Tao, Y., ... & York, D. M. (2020). Improved alchemical free energy calculations with optimized smoothstep softcore potentials. Journal of Chemical Theory and Computation, 16(9), 5512-5525. DOI: 10.1021/acs.jctc.0c00237
Opinion
- Song, L. F., & Merz Jr, K. M. (2020). Evolution of alchemical free energy methods in drug discovery. Journal of Chemical Information and Modeling, 60(11), 5308-5318. DOI: 10.1021/acs.jcim.0c00547
Reviews
- Sabe, V. T., Ntombela, T., Jhamba, L. A., Maguire, G. E., Govender, T., Naicker, T., & Kruger, H. G. (2021). Current trends in computer aided drug design and a highlight of drugs discovered via computational techniques: A review. European Journal of Medicinal Chemistry, 224, 113705. DOI: 10.1016/j.ejmech.2021.113705
- Bassani, D., & Moro, S. (2023). Past, present, and future perspectives on computer-aided drug design methodologies. Molecules, 28(9), 3906. DOI: 10.3390/molecules28093906
- Yang, C., Chen, E. A., & Zhang, Y. (2022). Protein–ligand docking in the machine-learning era. Molecules, 27(14), 4568. DOI: 10.3390/molecules27144568
- Dhakal, A., McKay, C., Tanner, J. J., & Cheng, J. (2022). Artificial intelligence in the prediction of protein–ligand interactions: recent advances and future directions. Briefings in Bioinformatics, 23(1), bbab476. DOI: 10.1093/bib/bbab476
- Sadybekov, A. V., & Katritch, V. (2023). Computational approaches streamlining drug discovery. Nature, 616(7958), 673-685. DOI: 10.1038/s41586-023-05905-z
Forming your perspective¶
As a student, writing a perspective article can seem daunting, especially when you're still learning the field. By focusing on a specific article, you can develop a well-grounded perspective that demonstrates your ability to engage deeply with current research, think critically about challenges, and propose thoughtful ideas for future work. Follow this framework to help you get started.
Deep reading of the focal article¶
Begin by immersing yourself in the assigned focus article. Read it multiple times, each time with a different purpose:
- First read: Grasp the main argument and findings.
- Second read: Analyze the methods, results, and conclusions in detail.
- Third read: Critically evaluate the article's strengths, limitations, and implications.
Pay particular attention to:
- The "Introduction" to understand the context and motivation for the study;
- The "Methods" section to grasp the approaches being used;
- The "Results" to understand the key findings;
- The "Discussion" section, where authors interpret their results and place them in a broader context;
- The "Future Directions" or "Conclusions" sections, where researchers often highlight unanswered questions or areas needing further investigation.
Take detailed notes as you read, focusing on the main arguments, methodologies, and unresolved questions in the article.
Contextual analysis¶
Examine the article's citations to understand its theoretical and empirical foundations:
- Read key papers cited in the introduction and discussion sections.
- Investigate more recent papers that have cited this article to see how the field has evolved.
This will help you place the focus article within the broader context of the field.
Identify key points¶
Create a list of the article's most important elements:
- What is the main problem or question the article addresses?
- What are the key methods or approaches used?
- What are the most significant findings or conclusions?
- What limitations or future directions does the article suggest?
Critical evaluation¶
Assess the strengths and weaknesses of the study's approach:
- Consider alternative interpretations of the results.
- Reflect on how well the conclusions are supported by the data.
- Think about potential limitations that weren't addressed in the article.
Develop your perspective¶
Based on your analysis, start forming your own views. Your perspective could:
- Extend the article's approach to a new problem or dataset.
- Propose modifications to the method to address identified limitations.
- Suggest how the findings could be applied in a different context.
- Argue for or against the article's conclusions based on other literature.
- Identify gaps or unanswered questions raised by the article.
Remember, your perspective doesn't need to be entirely novel. It could be a synthesis of the focus article's ideas applied in a new way, or a call for more research in a particular direction. The key is to support your viewpoint with evidence from the focus article and related literature, and to clearly articulate why your perspective is important for the field.
Support your perspective¶
Use evidence from the focal article and related literature to support your viewpoint. Explain how your perspective builds on, challenges, or extends the article's work.
Consider broader implications¶
Reflect on how your perspective contributes to the wider field of computational biology. Discuss potential practical applications or theoretical advancements that could result from your perspective.
Formulate research questions¶
Based on your perspective, propose specific research questions or hypotheses that could be investigated in future studies.
As you develop your perspective, continually refer back to the focus article and your notes. Ensure that your viewpoint is grounded in the current state of the field and addresses real, recognized challenges or opportunities. Don't be afraid to discuss your emerging ideas with your professors or peers; these conversations can help refine and strengthen your perspective.
Guidelines¶
This guide outlines the structure and requirements for writing a perspective article in computational biology. You will be tasked with crafting a thoughtful, well-researched piece that demonstrates your understanding of a specific topic within the field. Your article should be 5-7 pages long (excluding the cover page, references, tables, and figures) and follow the typesetting format requirements. The topic must be based on one of the provided themes and associated papers discussed in class.
Components¶
Your perspective article will consist of several key sections, each serving a distinct purpose in presenting your analysis and insights. By following this structure, you will create a comprehensive and cohesive article that not only contributes meaningfully to the discourse in computational biology but also values and integrates the perspectives of your peers.
Remember, while the format provides a framework, the strength of your article lies in your ability to synthesize information, think critically about the subject matter, and present a well-reasoned perspective on your chosen topic. As you write, consider your audience: your peers, who may have foundational knowledge of computational biology but may not be experts in your topic.
1. Introduction¶
Purpose: To engage your reader, provide context, and clearly state your perspective.
Key components:
- Opening statement: Capture attention with a thought-provoking fact, question, or statement related to your topic.
- Background funnel: Start broad and gradually narrow down to your specific focus area.
- Current gaps: Identify the limitations or unanswered questions in the current research.
- Your perspective (thesis): Clearly state your main argument or hypothesis.
- Article structure: Briefly outline the main sections of your paper.
Advice:
- Keep your introduction concise but informative.
- Ensure your thesis is clear, specific, and debatable.
- Use language that is accessible to a broad audience within computational biology.
2. Field Overview¶
Purpose: To provide a comprehensive background of your topic, demonstrating your understanding of the field.
Key components:
- Key concepts: Define and explain fundamental terminology and ideas.
- Current state of the field: Summarize recent breakthroughs and ongoing research.
- Major debates: Present different viewpoints objectively.
- Relevance and potential impacts: Discuss why your topic matters in the broader context of computational biology.
Advice:
- Use a mix of seminal works and recent publications to show the evolution of ideas.
- Organize this section with clear subheadings for easy navigation.
- Maintain an objective tone when presenting different viewpoints.
3. Your Perspective (Core Argument)¶
Purpose: To present and defend your unique perspective on the topic.
Key components:
- Clearly restate your main argument or hypothesis.
- Present evidence supporting your perspective.
- Address potential counterarguments.
- Discuss the implications of your perspective.
Advice:
- Use a logical structure to build your argument.
- Support each point with evidence from your research.
- Anticipate and address potential criticisms or alternative viewpoints.
- Explain how your perspective advances the field or offers new insights.
4. Case Studies or Examples¶
Purpose: To illustrate your perspective with concrete, real-world applications.
Key components:
- Select 2-3 relevant case studies or examples.
- Describe each case study in detail.
- Analyze how each example supports your perspective.
- Connect the examples back to your main argument.
Advice:
- Choose diverse examples to show the breadth of your perspective's applicability.
- Be specific in your analysis, avoiding vague generalizations.
- Explain why these particular examples are significant.
5. Future Directions¶
Purpose: To demonstrate the potential impact and longevity of your perspective.
Key components:
- Discuss potential future research based on your perspective.
- Identify open questions or challenges in the field.
- Propose potential solutions or approaches.
Advice:
- Be speculative but ground your ideas in current research and trends.
- Consider both short-term and long-term implications.
- Discuss how your perspective could shape future work in the field.
6. Conclusion¶
Purpose: To summarize your argument and leave a lasting impression on the reader.
Key components:
- Summarize key points from each section.
- Restate your main argument.
- Emphasize the significance of your perspective.
- End with a thought-provoking statement or call to action.
Advice:
- Avoid introducing new information in the conclusion.
- Reinforce the importance of your perspective in the broader context of computational biology.
- Leave the reader with something to ponder or a clear next step for the field.
7. References¶
Purpose: To give credit to your sources and provide a resource for further reading.
Advice:
- Follow APA format consistently.
- Include a mix of primary research articles, review papers, and books.
- Ensure all in-text citations correspond to entries in your reference list.
Cover Page¶
The cover page should present the title, your name, course title, and the date of the paper.
Introduction¶
Crafting the introduction for your scientific perspective article is a fundamental step in engaging your readers and laying a solid foundation for your arguments. This section serves multiple essential functions, each contributing to a compelling and coherent opening that sets the stage for your entire piece.
Purpose of the Introduction¶
The introduction of your scientific perspective article plays a pivotal role in establishing the context and significance of your work within the broader field of computational biology. Setting the context involves providing the necessary background information to situate your perspective within the field's current landscape. Think of this as creating a map for your readers, highlighting critical areas of research and development. By doing so, you clarify where your ideas fit, whether you are supporting, expanding, or challenging existing viewpoints. For example, if your article focuses on machine learning in genomics, you might outline the evolution of computational methods in genomic analysis and pinpoint where machine learning has recently become central.
Highlighting the significance of your topic is equally important. Here, you address the "so what?" question, demonstrating why your perspective matters. Explain how your topic relates to current challenges in the field, its potential impact on future research, or its implications for practical applications in areas like medicine or biotechnology. For instance, you might state, "Understanding the functional impact of genetic variations is crucial for personalized medicine, yet current computational methods fall short in accuracy and scalability." This underscores the relevance of your work and engages readers by showing them the importance of your perspective.
Additionally, the introduction should provide a concise overview of the current state of research. This snapshot includes key findings, methodologies, and ongoing debates within your specific area, establishing a foundation for your perspective. Keep this overview brief and focused, and avoid an exhaustive literature review. For example, you might mention, "Recent advancements in single-cell RNA sequencing have unveiled unprecedented cellular heterogeneity, yet integrating this data with existing models remains challenging."
One of the most crucial aspects of your introduction is presenting your main argument or viewpoint. Clearly state your thesis, the core idea or perspective your article will develop. This declaration tells readers what to expect in the rest of your article. For instance, "This article argues that integrating artificial intelligence with traditional computational models can significantly enhance the accuracy of protein structure predictions." This statement introduces your main argument and signals the direction your article will take.
Finally, the introduction should outline the structure of your article, acting as a roadmap for your readers. By briefly mentioning the main sections or key points you will cover, you guide your audience through the organization of your arguments. This ensures a logical flow and helps readers follow your reasoning more easily. An example might be, "Following this introduction, the article will review current AI applications in genomics, discuss the limitations of existing models, present the proposed hybrid AI framework, and explore its potential implications for future research and clinical practice."
Detailed Instructions for Crafting Your Introduction¶
Cite Key Literature¶
Demonstrating your field knowledge through references is essential for establishing credibility and providing context for your perspective. Begin by citing seminal papers that have shaped your area of research. These foundational studies provide a historical backdrop and highlight the evolution of key concepts. Next, include recent breakthrough studies that showcase the latest advancements and current trends in computational biology. Additionally, cite works that directly support or contrast with your perspective to present a balanced view and acknowledge different viewpoints within the field. Together, these citations should show that your introduction reflects a well-rounded understanding of the topic. However, be selective; focus on the most relevant and impactful works to maintain clarity and conciseness. For example, "Smith et al. (2020) pioneered the use of deep learning in genomic data analysis, while recent studies by Lee and Kim (2023) have expanded its applications to personalized medicine."
Crafting an Effective Opening Statement¶
Optimal length: 1 to 3 sentences
The opening statement is your first opportunity to capture your reader's attention and set the tone for your article. Aim for one to three sentences that are both engaging and informative. Your opening should immediately engage your audience, establish the context and relevance of your topic, and set the tone for your perspective. To achieve this, consider the following approaches:
- Highlight a Significant Recent Trend or Advancement: For example, "The advent of single-cell RNA sequencing has revolutionized our understanding of cellular diversity in complex tissues."
- Describe a Current Limitation or Challenge: For instance, "Despite the abundance of genomic data, accurately interpreting the functional consequences of genetic variations remains a significant hurdle in personalized medicine."
- Summarize the State of Research in Your Focus Area: Such as, "Integrating machine learning techniques with structural biology has opened new avenues for predicting protein structures and designing targeted drugs."
Avoid overly broad statements, gimmicky hooks, and clichés that can make your opening seem disconnected or unoriginal. Instead, use clear and precise language appropriate for an undergraduate audience, and define any specialized terms if necessary to ensure accessibility. Conclude your opening with a subtle hook that encourages readers to continue, seamlessly transitioning into the rest of your introduction.
Developing a Background Funnel¶
Optimal length: 6 to 8 sentences
The background funnel is a critical component that bridges your opening statement to the main body of your work. Spanning six to eight sentences, it provides essential context by starting with broad concepts and gradually narrowing down to your specific research focus. Begin with a general statement about relevant computational biology topics to set the stage within the larger field. For example, "Computational biology has become integral to modern biological research, enabling the analysis of vast datasets with unprecedented precision."
From there, gradually narrow your focus towards more specific aspects of your subfield. Incorporate a brief history of your topic, highlighting key discoveries or milestones that have shaped current research. This historical context not only educates your audience but also demonstrates the evolving nature of the field. For instance, "Early applications of machine learning in genomics focused on pattern recognition, but recent advancements have allowed for more complex predictive capabilities."
Throughout this section, explain critical concepts necessary for understanding your perspective. Use accessible language suitable for an undergraduate audience, and define specialized jargon to maintain inclusivity. For example, "Deep learning, a subset of machine learning, utilizes neural networks with multiple layers to model intricate biological processes." Ensure each sentence logically leads to the next, maintaining a smooth flow of ideas from broad to specific. Conclude the funnel by clearly stating the purpose and scope of your article, setting up the detailed exploration to follow. An example might be, "This perspective explores how hybrid AI models can address these integration challenges, offering a pathway to more comprehensive biological insights."
Identifying Current Gaps¶
Optimal length: 4 to 6 sentences
After establishing the background, it's crucial to identify current gaps in the research landscape. In four to six sentences, highlight one or two main challenges, gaps, or controversies directly relevant to your work. These issues should represent significant areas where current understanding or methodologies fall short. For example, "While machine learning models have advanced predictive accuracy, they often require extensive computational resources and lack interpretability." Briefly indicate how your perspective relates to these gaps, providing a rationale for your research focus. This connection justifies the importance and relevance of your work within the larger scientific conversation. However, avoid delving into the specifics of your approach or findings in this section; save detailed explanations for later in your article. Your goal is to pique interest and establish context, laying the groundwork for your unique contribution.
Introducing Your Perspective¶
Optimal length: 2 to 3 sentences
Introducing your perspective involves presenting your main argument or approach clearly and concisely. In two to three sentences, state your thesis—the core idea or perspective your article will develop. Explain how it contributes to the field, addressing the gaps or challenges you've identified. Formulate a specific and debatable thesis statement that serves as the focal point of your article. For example, "This article proposes a novel hybrid AI model that combines deep learning with rule-based systems to enhance both the accuracy and interpretability of genomic data analysis." This statement not only introduces your main argument but also highlights its significance and potential impact on the field.
Outlining the Article Structure¶
Optimal length: 2 sentences
Concluding your introduction with an outline of the article structure provides a clear roadmap for your readers. In two sentences, briefly mention the main sections or key points you will cover, ensuring that the outline logically follows from your thesis and supports your overall argument. This helps guide your audience through the organization of your arguments and sets their expectations for the flow of your article. For instance, "Following this introduction, the article will review current AI applications in genomics, discuss the limitations of existing models, present the proposed hybrid AI framework, and explore its potential implications for future research and clinical practice." This overview ensures that readers can follow your reasoning and understand how each section contributes to your perspective.
Comprehensive Example of an Introduction¶
To illustrate how these components come together, consider the following complete introduction example:
Opening Statement: "The advent of single-cell RNA sequencing has revolutionized our understanding of cellular diversity in complex tissues."
Background Funnel: "Computational biology has become integral to modern biological research, enabling the analysis of vast datasets with unprecedented precision. Within this field, the integration of machine learning algorithms has transformed data interpretation and predictive modeling. Early applications of machine learning in genomics focused on pattern recognition, but recent advancements have allowed for more complex predictive capabilities. Deep learning, a subset of machine learning, utilizes neural networks with multiple layers to model intricate biological processes. Despite these advancements, challenges remain in integrating diverse data types to enhance predictive accuracy. This perspective explores how hybrid AI models can address these integration challenges, offering a pathway to more comprehensive biological insights."
Current Gaps: "While machine learning models have advanced predictive accuracy, they often require extensive computational resources and lack interpretability. Addressing these limitations is crucial for their practical application in clinical settings."
Your Perspective: "This article proposes a novel hybrid AI model that combines deep learning with rule-based systems to enhance both the accuracy and interpretability of genomic data analysis. By integrating these methodologies, the model offers a more robust framework for personalized medicine applications."
Article Structure: "Following this introduction, the article will review current AI applications in genomics, discuss the limitations of existing models, present the proposed hybrid AI framework, and explore its potential implications for future research and clinical practice."
Additional tips¶
When writing your introduction, remember to start broad and then narrow down. Begin with general information and progressively focus on your specific topic to guide readers smoothly into your perspective. Strive for clarity and conciseness, avoiding unnecessary jargon and overly complex sentences to ensure your introduction is accessible to a broad audience.
Engage the reader with compelling statements that pique interest and encourage them to continue reading. After drafting your introduction, revise and edit to refine the flow, clarity, and impact, ensuring each part serves its intended purpose effectively. Lastly, seek feedback from peers or instructors to receive constructive insights and make necessary improvements.
By following this comprehensive guide, you will be well-equipped to craft introductions that effectively set the stage for your scientific perspective articles. Engaging your readers and clearly presenting your unique viewpoints within the field of computational biology will lay a strong foundation for the rest of your work.
Field overview¶
Writing the Field Overview section of your scientific perspective article is crucial in situating your work within the broader context of computational biology. This section bridges your introduction and the detailed arguments or analyses that follow, allowing you to delve deeper into existing literature, highlight key findings, discuss collective contributions, and identify trends, gaps, or controversies pertinent to your perspective.
Purpose¶
The primary purpose of the Field Overview is multifaceted. It involves summarizing key findings from relevant papers, discussing how these studies collectively advance our understanding of the topic, highlighting any conflicting results or interpretations, and identifying emerging trends or patterns in the research. Additionally, it requires presenting and discussing key methods, data, and analyses, emphasizing their strengths and limitations. Including unifying themes or overarching results, as well as weaknesses, gaps, or inconsistencies, is essential to avoid simply summarizing papers. Instead, aim to analyze and synthesize the information critically, setting the stage for your unique contribution.
Structuring the Section Effectively¶
When crafting this section, structure it to enhance readability and coherence. Organize your content under headings that are assertion statements rather than mere topic labels. For example, instead of using a heading like "Machine Learning Models," opt for "Machine Learning Models Enhance Predictive Accuracy in Genomics." This approach communicates the main point of each section and guides the reader through your analysis. Ensure that the body of your Field Overview follows directly from the last paragraph of your introduction, maintaining narrative continuity without starting a new page.
Summarizing Key Findings¶
In summarizing key findings, select the most relevant and impactful papers within your chosen theme. Provide concise summaries of their main findings, focusing on how each relates to your perspective and the overarching themes of your article. For instance, you might write: "Smith et al. (2021) demonstrated that integrating deep learning with traditional algorithms significantly improves protein structure prediction accuracy." This approach ensures that your summaries are not just isolated facts but are connected to the larger narrative you are constructing.
Discussing Collective Contributions¶
Discuss how these studies contribute to the field by synthesizing the information to explain how the collective findings advance understanding, fill gaps, or open new research avenues. You could note: "Collectively, these studies illustrate a trend toward hybrid models that leverage the strengths of both machine learning and rule-based systems." By highlighting the research's cumulative impact, you provide context for the significance of your perspective within the field.
Highlighting Conflicting Results or Interpretations¶
Highlighting conflicting results or interpretations among the studies adds depth to your analysis. Identify any discrepancies or debates within the field and discuss their significance. For example: "While Jones et al. (2022) reported improved accuracy with unsupervised models, Lee and Kim (2023) found that supervised approaches yielded more reliable results in clinical settings." This discussion acknowledges different viewpoints and underscores the dynamic nature of scientific research, emphasizing the need for ongoing inquiry and exploration.
Identifying Emerging Trends or Patterns¶
Identifying emerging trends or patterns is another key aspect of the Field Overview. Point out new directions, patterns, or innovations shaping the field's future and discuss their impact. An example might be: "An emerging trend is the use of federated learning to address data privacy concerns in genomic analysis." By illuminating these trends, you help readers understand the evolving landscape and how your perspective fits within it.
Presenting and Discussing Key Methods, Data, and Analyses¶
Presenting and discussing key methods, data, and analyses provides insights into the methodologies and data underpinning the current research state. Emphasize innovative or widely adopted techniques and critically evaluate their strengths and limitations. For instance: "The use of convolutional neural networks allows for modeling spatial hierarchies in genomic data; however, this approach can be computationally intensive." This examination of methods helps readers appreciate the technical nuances and challenges in the field.
Including Unifying Themes and Identifying Gaps¶
Including unifying themes or overarching results, as well as weaknesses, gaps, or inconsistencies, helps to synthesize the literature and set the stage for your perspective. Point out themes or conclusions that emerge across multiple studies and highlight areas where the research could be more consistent. For example: "Despite advancements, a common weakness among current models is the lack of interpretability, which hinders their clinical adoption." By doing so, you avoid merely summarizing papers and instead provide a critical analysis that adds value to your article.
Maintaining Narrative Continuity¶
Ensure the Field Overview flows seamlessly from your introduction, maintaining narrative continuity. It should directly follow your introduction without starting a new page and avoid repetition of context or motivation already covered. Organize the content logically, with each section building upon the previous one. If necessary, incorporate subheadings to break down complex topics into manageable parts, but always ensure that all headings are assertion statements. This structure facilitates a smooth reading experience and reinforces the logical progression of your argument.
Presenting Your Perspective¶
The "Presenting Your Perspective" section is the heart of your scientific perspective article. This is where you articulate, explain, and defend your unique viewpoint on the topic you've chosen within computational biology. This section allows you to showcase your critical thinking skills, demonstrate your understanding of the field, and contribute to the ongoing scientific discourse.
Purpose¶
The primary purpose of this section is to:
- Clearly articulate your perspective or argument
- Provide evidence and reasoning to support your viewpoint
- Address potential counterarguments or limitations
- Discuss the implications and significance of your perspective
- Suggest future directions or applications of your ideas
Structuring Your Perspective¶
Articulating Your Perspective¶
Begin by restating your main argument or perspective, expanding on the thesis you introduced earlier. Clearly explain your viewpoint, ensuring it's specific, debatable, and relevant to the field of computational biology. For example:
"This perspective argues that integrating explainable AI techniques with current deep learning models in genomics can significantly enhance both the accuracy and interpretability of genetic variant analysis, addressing a critical gap in personalized medicine applications."
Elaborate on the key components of your perspective, breaking it down into its core ideas or propositions. Each of these should be clear, concise, and directly related to your main argument.
Supporting Your Perspective¶
Provide evidence and reasoning to support your viewpoint. This may include:
- Relevant research findings from your literature review
- Logical arguments based on established principles in computational biology
- Examples or case studies that illustrate your points
- Theoretical models or frameworks that underpin your perspective
For each piece of evidence or argument, explain how it supports your perspective. Use clear, concise language and maintain a logical flow between ideas.
Addressing Counterarguments¶
Acknowledge potential counterarguments or limitations to your perspective. This demonstrates critical thinking and strengthens your argument by showing you've considered alternative viewpoints. For each counterargument:
- Clearly state the opposing view
- Explain why it might be considered valid
- Provide a reasoned response that defends your perspective
For example: "Some researchers argue that the complexity of explainable AI models may reduce their efficiency in large-scale genomic analyses. However, recent advancements in computational power and optimized algorithms suggest that this trade-off is becoming increasingly negligible."
Implications and Significance¶
Discuss the potential impact and importance of your perspective within the field of computational biology. Consider:
- How your perspective addresses current gaps or challenges in the field
- Potential applications or benefits of adopting your viewpoint
- Broader implications for related areas of research or practical applications
For instance: "By enhancing the interpretability of AI models in genomics, this approach could significantly improve clinicians' ability to make informed decisions based on genetic data, potentially revolutionizing personalized medicine practices."
Writing Tips¶
- Use clear, precise language: Avoid jargon where possible, and define technical terms when necessary.
- Maintain a logical flow: Use transition sentences to connect paragraphs and ideas smoothly.
- Be assertive but not overconfident: Use phrases like "This perspective suggests..." or "Evidence indicates..." rather than absolute statements.
- Incorporate relevant citations: Support your arguments with references to peer-reviewed literature, but avoid over-reliance on any single source.
- Use examples or analogies: When explaining complex ideas, consider using examples or analogies to make them more accessible.
- Be objective: While presenting your perspective, maintain a balanced and scholarly tone.
- Revise and refine: After writing, review your section to ensure clarity, coherence, and strong argumentation.
Common Pitfalls to Avoid¶
- Overgeneralization: Ensure your claims are specific and supported by evidence.
- Ignoring contradictory evidence: Address conflicting data or viewpoints directly.
- Lack of originality: While building on existing research, ensure your perspective offers a unique contribution.
- Weak argumentation: Each point should clearly support your main thesis.
- Overreliance on speculation: Ground your perspective in current research and logical reasoning.
Future Directions¶
The Future Directions section is a concise yet crucial component of your perspective paper. It serves as a bridge between your current analysis and potential advancements in the field. This brief section demonstrates your ability to think critically about the implications of your perspective and envision how it might shape future research in computational biology.
Purpose of the Future Directions Section¶
In just a few paragraphs, aim to:
- Suggest logical next steps based on your perspective
- Identify key research questions or hypotheses
- Briefly discuss potential methodological approaches
- Consider broader impacts on the field
Writing Your Future Directions (2-3 paragraphs total)¶
Paragraph 1: Immediate Research Opportunities¶
Outline the most direct and immediate research opportunities arising from your perspective. Focus on specific, feasible studies or experiments that could be conducted in the near future. For example:
"A logical next step would be to develop a prototype of the proposed hybrid AI model and test its performance on diverse genomic datasets. Comparing its accuracy and interpretability against current state-of-the-art models in predicting gene expression patterns could validate the approach's potential. Additionally, exploring how this model performs across different populations could address critical questions of generalizability in genomic analysis."
Paragraph 2: Long-term Trajectories and Challenges¶
Discuss potential long-term research trajectories and acknowledge challenges. Include more ambitious ideas that could shape the field's direction, while also considering obstacles. For instance:
"In the long term, this approach could pave the way for AI systems capable of autonomously generating and testing hypotheses about gene function and regulation, potentially accelerating discoveries in functional genomics. However, realizing this potential may require advancements in computational power and the development of more extensive, well-annotated genomic databases. Establishing international collaborations and data-sharing initiatives could help overcome these hurdles while ensuring model robustness across diverse populations."
Optional Paragraph 3: Interdisciplinary Connections¶
If space allows, briefly explore interdisciplinary connections:
"The explainable AI approach proposed here may have significant implications beyond computational biology. Collaborations with bioethicists and legal scholars could help shape guidelines for the responsible use of these technologies in clinical settings, addressing crucial questions about AI-assisted medical decisions and patient communication."
Writing Tips¶
- Be concise and focused: Given the limited space, prioritize the most impactful ideas.
- Maintain relevance: Ensure all proposed directions clearly connect to your main perspective.
- Balance specificity and breadth: Provide concrete suggestions while also touching on broader implications.
- Use conditional language: Given the speculative nature, use phrases like "could potentially" or "might lead to."
Common Pitfalls to Avoid¶
- Being too vague: Provide specific research questions or approaches, not general statements.
- Overreaching: Ensure your proposed directions are grounded in scientific possibility.
- Losing focus: Keep future directions relevant to your main perspective and computational biology.
By following these guidelines, you'll craft a concise yet impactful Future Directions section that effectively concludes your perspective paper and contributes valuable ideas to the field of computational biology.
Conclusions¶
In your perspective paper, it's important to include a brief concluding paragraph to wrap up your ideas effectively. This final section serves to reinforce your main points and leave a lasting impression on your readers.
Purpose of the Concluding Paragraph¶
The concluding paragraph should:
- Briefly restate the main thrust of your perspective
- Emphasize the key implications or potential impact of your viewpoint
- Provide a sense of closure to your paper
- End with a compelling final thought or call to action for the field
Structure and Content (3-5 sentences)¶
Your concluding paragraph should be concise, typically 3-5 sentences long. Here's a suggested structure:
- Opening sentence: Restate your main perspective or argument.
- 1-2 sentences: Summarize the key implications or significance of your viewpoint.
- 1-2 sentences: Highlight the broader impact on the field of computational biology.
- Final sentence: End with a strong, forward-looking statement or call to action.
Example Concluding Paragraph¶
"This perspective proposes that integrating explainable AI techniques with current deep learning models in genomics can significantly enhance both the accuracy and interpretability of genetic variant analysis. By addressing the critical challenges of model transparency and clinical applicability, this approach has the potential to accelerate the translation of genomic insights into improved patient care. The implications extend beyond personalized medicine, potentially revolutionizing our understanding of complex genetic disorders and paving the way for more targeted therapies. As computational biology continues to evolve, embracing such hybrid approaches may be key to unlocking the full potential of AI in healthcare and advancing our ability to decipher the complexities of the human genome."
Writing Tips¶
- Avoid introducing new information or arguments. Focus on summarizing and reinforcing your main points.
- Concentrate on the broader implications of your perspective within computational biology and related fields.
- Consider referencing ideas or questions posed in your introduction to provide a sense of closure.
- Conclude with a memorable statement that encapsulates the importance of your viewpoint and inspires further thought or action.
- While you're summarizing key points, try to rephrase rather than directly repeat sentences from earlier in your paper.
Common Pitfalls to Avoid¶
- Ensure your concluding thoughts are specific to your perspective.
- Avoid introducing doubts or weaknesses in your conclusion.
- While emphasizing impact is important, be cautious not to exaggerate the potential effects of your perspective.
- Ensure your conclusion flows naturally from the rest of your paper.
References¶
- Place your references starting on a new page after the conclusions with a References heading.
- Use the author–date citation system in the APA style.
- References should be sorted alphabetically and not numbered.
- Reference pages do not count towards the text requirement.
- Every reference must have a DOI. This excludes most webpages unless they are archived and version controlled.
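The reference rules above (alphabetical order, a DOI on every entry) lend themselves to a quick self-check before submission. The following is a minimal Python sketch, not a course-provided tool: the `Reference` class, its fields, and both helper functions are hypothetical names introduced for illustration, the DOI pattern is a loose assumption rather than a full validator, and the sample entries are placeholders, not real publications.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Loose DOI shape (assumption): "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


@dataclass
class Reference:
    """Hypothetical container for one reference-list entry."""
    first_author_surname: str
    year: int
    title: str
    doi: Optional[str] = None


def sort_references(refs: List[Reference]) -> List[Reference]:
    """Order entries alphabetically by first author's surname, then by year."""
    return sorted(refs, key=lambda r: (r.first_author_surname.lower(), r.year))


def missing_doi(refs: List[Reference]) -> List[Reference]:
    """Return entries whose DOI is absent or does not look like a DOI."""
    return [r for r in refs if not (r.doi and DOI_PATTERN.match(r.doi))]


# Placeholder entries, not real publications.
refs = [
    Reference("Smith", 2021, "Placeholder title B", "10.1234/example.b"),
    Reference("Doe", 2023, "Placeholder title A", None),
]
print([r.first_author_surname for r in sort_references(refs)])
print([r.first_author_surname for r in missing_doi(refs)])
```

A check like this only catches ordering and missing DOIs; the APA formatting of each entry still needs to be verified by hand or with a reference manager.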
Tables and Figures¶
- Only have one table or figure per page.
- Tables are placed first and sequentially numbered; then figures are placed last and sequentially numbered. For example, Table 1, Table 2, Figure 1, etc.
- Pages do not count towards the text requirement.
- If the table or figure is from an article, include the original figure legend and place it within a textbox with a border.
- You do not necessarily have to include the entire table or figure (e.g., you might show Panel A of the original figure).
- Write your own figure legend, placing it outside of the box.
- Make sure to cite the reference in the caption.
Typesetting¶
A Google Docs template will be shared with you with approved typesetting.
- Page: Use standard letter-sized paper (8.5" x 11"). Set margins to 1 inch on all sides.
- Font: Use one of the following fonts: Arial, Roboto, Helvetica, Open Sans, Verdana, or Lato. Use a 12 pt font size for all normal text except headings, which should be 14 pt.
- Spacing: Use 1.5 spacing for the entire document, including the main text, headings, and references.
- Paragraphs: Align text to the left margin (i.e., ragged right). There should be a blank line between paragraphs with no indent.
- Headings: Headings should be 14 pt, centered, concise, and bolded. Subheadings should be 12 pt, bolded, and aligned to the left margin.
- Consistency: Maintain consistency in font usage, heading styles, spacing, and formatting throughout the document.
Style¶
Your paper should cater to the intermediate audience.
Everyone has their own preferred style of writing. While many things are subject to taste, there are still some suggested guidelines one should follow when possible.
- Avoid using "I" in your writing whenever possible. In a perspective article, it's acceptable to use a more personal tone. While you should still maintain academic rigor, you can use phrases like "In our view," or "We propose" when presenting your unique insights or interpretations.
- Make sure to clearly distinguish between established facts from the literature and your own interpretations or proposals.
- Citations should always go before the period (i.e., to its left) rather than after it, because in-text citations are treated as part of the sentence. When writing for a scientific journal, follow that journal's guidelines.
- Use strong, clear language to articulate your perspective, but always support your points with evidence from the provided papers or other relevant sources.
- For every sentence, ask yourself how a grumpy old scientist would respond. Take the sentence, "Computational biology is revolutionizing our understanding of the world around us." A grumpy scientist could think, "Oh, good, like I didn't already know this and read five papers with this same introduction." Use this to gauge whether your statement is too broad.
- When listing items with (1), (a), etc., make sure to separate them with a "," or ";". For example, there is something to (a) show, (b) tell, and (c) see.
- The Oxford comma is strongly recommended.
- Avoid absolute or general statements like "best", "worst", "wrong", "only", etc. when someone could make an argument about some other approach being superior. For example, instead of "We chose to use the ff19SB force field because it is the best for our system." you can say, "Our MD simulations employed the ff19SB force field because previous publications [8, 9] have demonstrated its high performance for similar proteins." Instead of saying, "Molecular dynamics is the best method." you can say, "Molecular dynamics has shown promising results in similar studies."
- It is important to have variety in your sentence structure and length. For example, have some short sentences, some longer ones, use semicolons, em dashes, etc. Variety makes it easier to read and maintain focus.
- Minimize the amount of jumping around the reader has to do to follow your paper. For example, do not include a glossary at the end of your paper for technical terms. Explain them concisely in your writing.
Assignments
Theme analysis¶
This assignment is designed to initiate your perspective paper by familiarizing you with our primers in computational biology and teaching you critical reading skills. Specifically, you will:
- Gain familiarity with current themes in computational biology;
- Practice identifying and summarizing key ideas from scientific literature;
- Begin to engage critically with research in your chosen area.
Please submit your assignment as a PDF using Canvas.
Instructions¶
- Choose one of the provided perspective primers from the course materials.
- Skim two to three key papers related to the chosen theme. These can be from the starting papers or found through your literature search in our previous lecture's activity. Provide full citations for each paper using the APA format.
- Write a summary containing at least 250 words that includes (in no particular order):
- The theme/perspective primer you've chosen.
- A brief overview of the main ideas presented in the papers you've read.
- Any key questions or challenges highlighted in the literature.
- Your initial thoughts on the importance or relevance of this theme in computational biology.
- Please put any questions or concepts you are having difficulty understanding into your respective discussion on Canvas.
Rubric¶
Criterion | Points | Description |
---|---|---|
Theme Selection | 12 | Student has identified one of the provided perspective primers. |
Paper Selection | 12 | Student has chosen 2-3 relevant papers related to the selected theme. |
Summary Content | 12 | The summary covers the main ideas from the chosen papers, identifies key questions or challenges, and provides initial thoughts on the theme's importance. |
Citations | 12 | Full citations are provided for all papers referenced. |
Writing Clarity | 12 | The summary is written clearly and concisely. |
Literature search¶
This assignment will help you identify relevant scientific articles for your upcoming perspective paper. The focus is on creating a works cited list and brief summaries to ensure you're on the right track with your research.
Tip
This assignment should take approximately two hours to complete. If you find yourself spending significantly more time, you may be reading the papers too in-depth at this stage. Focus on efficiently identifying relevant articles and their key points rather than detailed reading.
Objective¶
Practice finding appropriate scientific articles and creating brief summaries of their main points. This process mimics the initial stages of literature review for a graduate-level perspective piece.
Instructions¶
- Conduct a literature search to identify 7-10 relevant scientific articles related to your chosen perspective primer topic.
- Include at least five primary research articles.
- These articles will form the basis of your works cited for the perspective paper.
- For each article, provide:
- Full citation in APA format
- A brief 2-3 sentence summary highlighting:
- Main research question or objective
- Key methodology (for primary articles)
- Principal findings or conclusions
Note
These summaries should be based on quickly scanning the abstract, introduction, and conclusion of each paper. Do not read the entire paper at this stage.
Submission Guidelines¶
- Submit your work as a PDF file through Canvas.
- Use 12-point font, 1.5 line spacing, and 1-inch margins.
Rubric¶
Criterion | Points | Description |
---|---|---|
Article Selection | 60 | Student has selected 7-10 relevant articles, including at least five primary research articles. The selection demonstrates a thoughtful approach to gathering diverse, relevant sources. |
Summary Content | 10 | Student has written a brief summary. |
Formatting | 10 | The submission follows the specified formatting guidelines. |
Introduction¶
This assignment is designed to guide you in drafting the introduction section of your perspective paper in computational biology. By applying the provided guidelines, you will structure your introduction effectively, synthesize the literature you have reviewed, and set the stage for presenting your unique perspective on your chosen topic.
Objectives:
- Understand the components of an effective introduction in a scientific perspective article.
- Synthesize existing literature to provide a comprehensive background for your topic.
- Develop a clear and engaging introduction that outlines your perspective and its significance within the field of computational biology.
- Practice academic writing standards, including proper citation and referencing.
Instructions¶
- Review Provided Guidelines:
- Refer to the Introduction Guide for Scientific Perspective Articles to understand the key elements required for a strong introduction.
- Utilize Previous Work:
- Incorporate the literature you have gathered and summarized in your previous assignments. Ensure that you reference and synthesize this literature to build a solid foundation for your introduction.
- Draft Your Introduction:
- Length: Approximately 1.5 pages (around 500-750 words).
- Content Requirements:
- Opening Statement: Capture the reader’s attention and establish the relevance of your topic.
- Background Funnel: Provide essential context, gradually narrowing down from broad concepts to your specific focus.
- Current Gaps: Identify key challenges or gaps in the current research that your perspective will address.
- Your Perspective: Clearly introduce your main argument or viewpoint, including a strong thesis statement.
- Article Structure: Outline the organization of your paper, briefly mentioning the main sections or key points you will cover.
- Use of Literature: Integrate and cite relevant literature to support your introduction.
- Formatting Requirements:
- File Format: Submit your draft as a PDF file.
- Font: Use a readable 12-point font.
- Spacing: 1.5 line spacing.
- Margins: 1-inch margins on all sides.
- Header: Include a header with your name, the course number, and the date (e.g., “Jane Doe | BIOSC 1630 | October 4, 2024”).
- References: Provide a reference list at the end of your introduction. Use the APA format.
- Submission Instructions:
- Upload your PDF file to the designated assignment link on Canvas by the due date and time.
Submission Guidelines¶
- File Naming: Use the format Lastname_Firstname_IntroDraft.pdf (e.g., Doe_Jane_IntroDraft.pdf).
- Proofread: Ensure your draft is free from grammatical errors and follows the formatting guidelines.
- References: Double-check that all in-text citations correspond to entries in your reference list and adhere to the APA format.
- Plagiarism: Ensure all sources are properly cited to avoid plagiarism. Use plagiarism detection tools if available.
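The file-naming rule above can be checked mechanically before uploading. The snippet below is a minimal Python sketch under stated assumptions: the function names and the regular expression are hypothetical, and the pattern assumes each name part consists of letters, optionally with hyphens or apostrophes, which the guideline itself does not spell out.

```python
import re

# Assumed reading of the rule: Lastname_Firstname_IntroDraft.pdf, where each
# name part starts with a letter and may contain hyphens or apostrophes.
FILENAME_PATTERN = re.compile(
    r"[A-Za-z][A-Za-z'\-]*_[A-Za-z][A-Za-z'\-]*_IntroDraft\.pdf"
)


def intro_draft_filename(last_name: str, first_name: str) -> str:
    """Build the expected filename from a last and first name."""
    return f"{last_name}_{first_name}_IntroDraft.pdf"


def is_valid_filename(filename: str) -> bool:
    """Return True if the whole filename matches the assumed pattern."""
    return FILENAME_PATTERN.fullmatch(filename) is not None


print(intro_draft_filename("Doe", "Jane"))
print(is_valid_filename("Doe_Jane_IntroDraft.pdf"))
print(is_valid_filename("Jane Doe intro.docx"))
```

Names with spaces or non-ASCII letters would fail this particular pattern, so treat a rejection as a prompt to double-check rather than as a definitive verdict.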
Rubric¶
Your introduction draft will be evaluated based on the following criteria. The total possible points for this assignment are 60 points.
Criterion | Points | Description |
---|---|---|
Opening Statement | 10 | Effectively engages the audience, establishes context, and sets the tone for the perspective. The opening should be compelling, relevant, and concise, drawing the reader into the topic. It should clearly introduce the specific area of computational biology being addressed. |
Background Funnel | 10 | Provides a clear progression from broad concepts to the specific topic, includes relevant history and critical concepts. The background should logically narrow down from general information about computational biology to the specific focus of the paper, including key milestones and essential concepts. |
Current Gaps | 10 | Identifies relevant challenges or controversies and links them to the perspective. The draft should clearly outline one or two main gaps in the current research and explain how these gaps justify the need for the perspective being presented. |
Your Perspective | 10 | Clearly introduces the main argument and its contribution, with a strong thesis statement. The perspective should be well-articulated, demonstrating originality and relevance. The thesis statement should be specific, debatable, and set up the direction for the rest of the paper. |
Article Structure | 5 | Effectively outlines the rest of the article and key points to be covered. The structure should provide a brief roadmap of the main sections or arguments that will be discussed, ensuring logical flow and coherence with the thesis. |
Use of Literature | 10 | Effectively incorporates and cites relevant literature throughout the introduction. The draft should demonstrate a strong understanding of existing research, citing seminal and recent studies appropriately. References should support the background, gaps, and perspective presented. |
Writing Quality | 5 | The writing is clear, concise, and appropriate for an undergraduate audience in computational biology. The draft should be well-organized, free of grammatical errors, and use appropriate academic language. Clarity and readability are essential for effective communication. |
Detailed Rubric Breakdown
- Opening Statement (10 points)
- 10-9: Exceptionally engaging and relevant; clearly sets the tone and context.
- 8-7: Engaging and relevant; sets a clear tone and context.
- 6-5: Moderately engaging; some relevance to the topic.
- 4-0: Lacks engagement or relevance; does not effectively introduce the topic.
- Background Funnel (10 points)
- 10-9: Comprehensive and logically structured; smoothly narrows from broad to specific.
- 8-7: Clear progression with relevant information; minor gaps in flow.
- 6-5: Adequate background but lacks some logical flow or depth.
- 4-0: Incomplete or poorly structured; fails to provide necessary context.
- Current Gaps (10 points)
- 10-9: Clearly identifies significant gaps; effectively links to the perspective.
- 8-7: Identifies relevant gaps with a clear connection to the perspective.
- 6-5: Identifies gaps but with a weak connection to the perspective.
- 4-0: Fails to identify relevant gaps or link them to the perspective.
- Your Perspective (10 points)
- 10-9: Thesis is clear, specific, and highly relevant; demonstrates originality.
- 8-7: Clear and relevant thesis; adequately sets up the paper.
- 6-5: Thesis is present but may lack specificity or relevance.
- 4-0: Thesis is unclear, vague, or missing.
- Article Structure (5 points)
- 5-4: Clearly outlines the structure with logical flow.
- 3-2: Provides a basic outline but may lack some clarity or detail.
- 1-0: Vague or incomplete outline of the article structure.
- Use of Literature (10 points)
- 10-9: Integrates a wide range of relevant literature seamlessly; citations are accurate.
- 8-7: Uses relevant literature effectively with minor citation issues.
- 6-5: Incorporates literature but may lack depth or have some citation errors.
- 4-0: Minimal or ineffective use of literature; numerous citation errors.
- Writing Quality (5 points)
- 5-4: Exceptionally clear, concise, and free of errors.
- 3-2: Generally clear with few errors; minor issues with conciseness.
- 1-0: Unclear, wordy, or contains multiple grammatical errors.
Additional Tips for Success¶
- Start Early: Begin drafting well before the due date to allow ample time for revisions.
- Follow the Guide: Refer to the Introduction Guide for Scientific Perspective Articles to ensure all components are addressed.
- Be Selective with Sources: Choose the most relevant and impactful literature to support your introduction.
- Maintain Clarity: Use straightforward language and avoid unnecessary jargon. Define specialized terms as needed.
- Revise Thoroughly: Review your draft multiple times to enhance clarity, coherence, and flow.
- Seek Feedback: Share your draft with peers or utilize writing center resources for constructive feedback before submission.
Field overview¶
Write a comprehensive "Field Overview" section for your perspective paper, providing the necessary background and context for your chosen topic.
Instructions¶
- Length: Approximately 2.5 pages.
- Content Requirements:
- Thoroughly define and explain fundamental concepts and terminology, offering more depth than in the introduction.
- Provide a comprehensive summary of the current state of the field, including recent breakthroughs and ongoing research.
- Analyze major debates or points of contention in detail, presenting various viewpoints objectively.
- Discuss the relevance and potential impacts of the topic more extensively.
- Research:
- Use a minimum of 5-7 peer-reviewed sources.
- Include both seminal works and recent publications (within the last 3-5 years) to show the evolution of ideas.
- Writing Style:
- Maintain an objective tone, presenting information without bias.
- Use clear, concise language appropriate for an audience with a basic understanding of computational biology.
- Properly cite all sources using the APA format.
- Organization:
- Use descriptive subheadings to structure your overview logically.
- Ensure smooth transitions between different aspects of the field.
- Develop each subtopic more fully than in the introduction, providing greater depth and breadth.
- Synthesis:
- Identify trends, patterns, or themes in the research you've reviewed.
- Discuss how different studies or approaches complement or contradict each other.
- Conclusion:
- End with a brief paragraph that ties the overview to your specific perspective topic, setting the stage for your argument.
Detailed Rubric¶
Your Field Overview will be evaluated based on the following criteria.
Criterion | Points | Description |
---|---|---|
Key Concepts | 30 | Clearly defines and explains fundamental concepts and terminology. Provides accurate and concise explanations. Uses appropriate scientific language. |
Current State | 30 | Comprehensively summarizes current understanding and major developments. Highlights recent advancements and ongoing research. Demonstrates awareness of cutting-edge work in the field. |
Major Debates | 30 | Identifies and describes main points of contention or debates. Presents different viewpoints objectively. Shows understanding of complexities in the field. |
Relevance | 8 | Effectively explains the significance of the topic. Discusses potential impacts on computational biology and related fields. Demonstrates the topic's importance in the broader scientific context. |
Research Quality | 8 | Uses appropriate and diverse sources. Includes both seminal works and recent publications. Demonstrates thorough research and understanding of the literature. |
Writing Quality | 8 | Writing is clear, concise, and well-organized. Uses proper grammar and spelling. Maintains an academic tone throughout. |
Citations and Formatting | 6 | Properly cites all sources using APA format. Adheres to submission guidelines for formatting. Includes a complete and correctly formatted reference list. |
Detailed rubric
- Key Concepts (30 points)
- 30-27: All relevant concepts clearly defined and explained; appropriate use of technical terms.
- 26-24: Most key concepts covered; explanations mostly clear and accurate.
- 23-20: Some important concepts missing or poorly explained.
- 19-0: Many key concepts missing or incorrectly explained.
- Current State (30 points)
- 30-27: Comprehensive, up-to-date summary of the field; includes cutting-edge research.
- 26-24: Good overview of current state; may miss some recent developments.
- 23-20: Basic summary of current state; lacks depth or misses important recent work.
- 19-0: Outdated or inaccurate representation of the current state.
- Major Debates (30 points)
- 30-27: Clearly identifies and explains major debates; presents multiple viewpoints objectively.
- 26-24: Covers main debates but may lack some nuance or balance.
- 23-20: Mentions debates but lacks depth or misses important perspectives.
- 19-0: Fails to identify major debates or presents them inaccurately.
- Relevance (8 points)
- 8-7: Clearly explains significance and potential impacts; strong connections to broader field.
- 6-5: Good explanation of relevance but may lack some depth or connections.
- 4-3: Basic explanation of relevance; lacks strong connections or implications.
- 2-0: Fails to adequately explain relevance or makes inaccurate connections.
- Research Quality (8 points)
- 8-7: Excellent range of high-quality, relevant sources; balance of seminal and recent works.
- 6-5: Good variety of sources; may slightly favor older or newer publications.
- 4-3: Adequate sources but lacks diversity or misses some important references.
- 2-0: Poor selection of sources; overreliance on outdated or non-peer-reviewed materials.
- Writing Quality (8 points)
- 8-7: Exceptionally clear, concise, and well-organized; free of grammatical errors.
- 6-5: Generally clear and well-organized; minor grammatical issues.
- 4-3: Some clarity or organization issues; several grammatical errors.
- 2-0: Poorly written; numerous grammatical errors; lacks clear organization.
- Citations and Formatting (6 points)
- 6-5: Perfect APA citations and formatting; complete and correct reference list.
- 4-3: Minor citation or formatting errors; reference list mostly correct.
- 2-1: Several citation or formatting errors; incomplete or poorly formatted reference list.
- 0: Major citation or formatting issues; missing or severely flawed reference list.
Additional guidance¶
Remember, the goal of this section is to provide a solid foundation for understanding your perspective. Take your time to research thoroughly and present the information clearly and logically.
- Start with an Outline: Before writing, create a detailed outline of your Field Overview. This will help you organize your thoughts and ensure you cover all necessary points.
- Use Topic Sentences: Begin each paragraph with a clear topic sentence that introduces the main idea. This helps with organization and makes your writing easier to follow.
- Explain Technical Terms: When introducing a new concept or technical term, always provide a brief, clear explanation. Don't assume your reader knows all the terminology.
- Use Transitional Phrases: To improve flow between paragraphs and sections, use transitional phrases like "Furthermore," "In contrast," "Similarly," or "However."
- Avoid Jargon and Acronyms: While some technical language is necessary, avoid excessive jargon. Always spell out acronyms on first use.
- Be Concise: Aim for clarity and brevity. Avoid unnecessarily complex sentences or repetition.
- Use Evidence: Support your statements with evidence from your research. This often involves citing relevant studies or quoting experts in the field.
- Revise and Edit: After writing your first draft, set it aside for a day, then return to revise. Look for areas to improve clarity, flow, and accuracy.
Tips for Success¶
- Critically evaluate your sources, ensuring they are reputable and relevant.
- Have a peer review your work for clarity and comprehensiveness.
- Revise and refine your overview to ensure it provides a solid foundation for your perspective.
Common feedback¶
- Instead of using headings like "Key Concepts" or "Molecular Dynamics", use assertion statements like, "Molecular simulations are invaluable for novel drug targets".
- In-text citations should be in the format of: (Maldonado et al., 2024).
- Ensure that your writing is academically formal.
- When talking about other methods or packages, it is crucial that you provide each with an individual citation. It is better to add a citation than leave it out. For example, "There are many MD simulation packages such as AMBER (CITE), NAMD (CITE), and GROMACS (CITE)."
- Always remember to ask yourself, "What value does this sentence have here?" and "Would the reader be able to understand the perspective without this sentence?"
Analysis¶
Write the "Analysis" section of your perspective article, articulating and defending your unique viewpoint on your chosen topic.
Instructions¶
- Length: Approximately 1 to 1.5 pages.
- Content Requirements:
- Clearly articulate your perspective or argument
- Provide evidence and reasoning to support your viewpoint
- Address potential counterarguments or limitations
- Discuss the implications and significance of your perspective
- Writing Style:
- Use clear, precise language appropriate for an academic audience
- Maintain an objective tone while presenting your argument
- Use proper citations in APA format
Rubric¶
Your analysis section will be evaluated based on the following criteria.
Criterion | Points | Description |
---|---|---|
Clarity of Perspective | 30 | Clear articulation of a specific, debatable perspective relevant to computational biology |
Evidence and Reasoning | 25 | Strong support for the perspective with relevant research, logical arguments, and examples |
Counterarguments | 20 | Thoughtful consideration of, and response to, potential counterarguments or limitations |
Implications and Significance | 15 | Insightful discussion of the potential impact and importance of the perspective |
Organization and Flow | 15 | Logical structure and smooth transitions between ideas |
Writing Quality | 10 | Clear, concise prose with appropriate academic tone and minimal errors |
Use of Sources | 5 | Appropriate integration and citation of relevant, peer-reviewed sources |
Detailed Rubric Breakdown¶
- Clarity of Perspective (30 points)
- 30-25: Perspective is exceptionally clear, specific, and relevant to computational biology
- 24-19: Perspective is clear and relevant, with minor areas for improvement in specificity
- 18-13: Perspective is somewhat clear but may lack specificity or relevance
- 12-0: Perspective is unclear, overly broad, or not relevant to computational biology
- Evidence and Reasoning (25 points)
- 25-21: Exceptional use of evidence and reasoning, with a strong connection to the perspective
- 20-16: Good use of evidence and reasoning, with clear connections to the perspective
- 15-11: Adequate evidence and reasoning, but connections to the perspective may be weak
- 10-0: Insufficient or irrelevant evidence, poor reasoning
- Counterarguments (20 points)
- 20-17: Thorough and thoughtful treatment of potential counterarguments or limitations
- 16-13: Good consideration of counterarguments, with minor areas for improvement
- 12-9: Some counterarguments addressed, but treatment may be superficial
- 8-0: Counterarguments ignored or inadequately addressed
- Implications and Significance (15 points)
- 15-13: Insightful and comprehensive discussion of implications and significance
- 12-10: Good discussion of implications and significance, with some depth
- 9-7: Basic discussion of implications and significance, lacking depth
- 6-0: Little or no discussion of implications and significance
- Organization and Flow (15 points)
- 15-13: Exceptional organization with seamless flow between ideas
- 12-10: Good organization and flow, with minor issues
- 9-7: Adequate organization, but flow between ideas may be choppy
- 6-0: Poor organization and flow, ideas are difficult to follow
- Writing Quality (10 points)
- 10-9: Excellent writing quality with clear, concise prose and appropriate academic tone
- 8-7: Good writing quality with minor errors or awkward phrasing
- 6-5: Adequate writing quality, but with noticeable errors or inconsistencies in tone
- 4-0: Poor writing quality with numerous errors or inappropriate tone
- Use of Sources (5 points)
- 5: Excellent integration of relevant, peer-reviewed sources with proper citations
- 4: Good use of sources with minor issues in integration or citation
- 3: Adequate use of sources, but may rely too heavily on a single source or have citation errors
- 2-0: Poor use of sources, lack of peer-reviewed sources, or major citation errors
Tips for Success¶
Remember, the goal is not just to state your viewpoint, but to persuade your readers of its validity and importance through careful argumentation and evidence.
- Start by clearly defining your perspective before diving into supporting evidence
- Use topic sentences to guide your reader through your argument
- Ensure each paragraph contributes directly to supporting your main perspective
- When addressing counterarguments, be fair in your representation of opposing views
- Use concrete examples or case studies to illustrate your points
- Review and revise your work, paying attention to the logical flow of your argument
Draft¶
Submit a complete, polished draft of your perspective paper in computational biology, incorporating all sections including the new "Future Directions" and "Conclusion" sections.
Instructions¶
- Length: 7-9 pages of main text (excluding cover page, tables, figures, and references).
- Required Sections:
- Cover Page
- Introduction
- Field Overview
- Analysis
- Future Directions (NEW)
- Conclusion (NEW)
- References
- Tables and Figures (if applicable)
- New Sections:
- Future Directions (1-2 paragraphs):
- Suggest logical next steps based on your perspective
- Identify key research questions or hypotheses
- Briefly discuss potential methodological approaches
- Consider broader impacts on the field
- Conclusion (1 paragraph, 3-5 sentences):
- Briefly restate your main perspective
- Emphasize key implications or potential impact
- Provide a sense of closure
- End with a compelling final thought or call to action
- Formatting Requirements:
- Page: Standard letter-sized (8.5" x 11"), 1-inch margins on all sides
- Font: Arial, Roboto, Helvetica, Open Sans, Verdana, or Lato
- Font size: 12 pt for normal text, 14 pt for headings
- Spacing: 1.5 spacing throughout the document
- Paragraphs: Left-aligned, blank line between paragraphs, no indent
- Headings: 14 pt, centered, concise, and bolded
- Subheadings: 12 pt, bolded, left-aligned
- Cover Page:
- Title of your paper
- Your name
- Course title
- Date of submission
- Tables and Figures:
- One table or figure per page.
- Number sequentially (Table 1, Table 2, Figure 1, etc.).
- Include original figure legend in a bordered textbox if from an article.
- Write your own figure legend outside the box.
- Cite the reference in the caption.
- References:
- Start on a new page after the conclusion.
- Use APA style with author-date citation system.
- Sort alphabetically, not numbered.
- Include DOI for every reference.
Submission Guidelines¶
- Submit your draft as a PDF file.
- Use the file naming format: `LastName_FirstName_PerspectiveDraft.pdf`.
- Submit through the designated assignment link on Canvas.
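If you want to double-check a file name before uploading, the naming format above can be expressed as a small pattern check. This is only a minimal sketch: the function name `is_valid_draft_filename` is hypothetical, and allowing hyphens and apostrophes in names is an assumption rather than a stated course rule.

```python
import re

# Pattern assumed from the stated format LastName_FirstName_PerspectiveDraft.pdf.
# Allowing hyphens and apostrophes in names is an assumption, not a course rule.
FILENAME_PATTERN = re.compile(r"[A-Za-z'-]+_[A-Za-z'-]+_PerspectiveDraft\.pdf")


def is_valid_draft_filename(filename: str) -> bool:
    """Return True if the whole filename matches the assumed naming format."""
    return FILENAME_PATTERN.fullmatch(filename) is not None


print(is_valid_draft_filename("Doe_Jane_PerspectiveDraft.pdf"))    # True
print(is_valid_draft_filename("Jane Doe perspective draft.docx"))  # False
```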
Rubric¶
Your draft will be evaluated based on the following criteria, for a total of 80 points. Remember, this draft should incorporate feedback from previous assignments and demonstrate your ability to articulate a well-reasoned perspective.
Content and Argumentation: 40 points
Score | Description |
---|---|
40-36 | Presents a clear, original perspective with strong supporting evidence. Thoroughly addresses counterarguments. Implications are insightfully discussed. |
35-32 | Perspective is clear with good supporting evidence. Most counterarguments are addressed. Implications are well-discussed. |
31-28 | Perspective is present but could be clearer. Some supporting evidence provided. Some counterarguments addressed. Basic discussion of implications. |
27-0 | Perspective is unclear or weak. Limited supporting evidence. Counterarguments not adequately addressed. Limited or no discussion of implications. |
Organization and Coherence: 15 points
Score | Description |
---|---|
15-14 | Excellent structure with smooth transitions. Ideas flow logically and cohesively. |
13-12 | Good structure with clear transitions. Ideas generally flow well. |
11-10 | Adequate structure but some transitions are unclear. Some ideas may seem disconnected. |
9-0 | Poor structure with abrupt transitions. Ideas often seem disconnected or illogical. |
Writing Quality and Style: 15 points
Score | Description |
---|---|
15-14 | Excellent prose with varied sentence structure. Appropriate academic tone maintained throughout. Minimal to no errors. |
13-12 | Good writing quality with some variety in sentence structure. Generally appropriate tone. Few minor errors. |
11-10 | Adequate writing quality but limited sentence variety. Tone may be inconsistent. Several minor errors. |
9-0 | Poor writing quality with little sentence variety. Inappropriate tone. Numerous errors that impede understanding. |
Adherence to Formatting and Style Guidelines: 10 points
Score | Description |
---|---|
10-9 | Perfectly adheres to all formatting and style guidelines. |
8-7 | Minor deviations from formatting or style guidelines. |
6-5 | Several noticeable deviations from formatting or style guidelines. |
4-0 | Significant deviations from formatting or style guidelines. |
Final draft¶
The final draft of your perspective paper in computational biology is optional. If you are satisfied with the grade received on your draft, that same grade will be applied to your final draft. However, if you would like to improve your grade, you may incorporate feedback from your draft and submit a final version for reassessment.
Note
- Your final draft grade cannot go down as long as no drastic detrimental changes are made.
- If you are happy with your draft grade, you do not need to submit a final draft.
Rubric¶
The rubric is the same as for the draft, but scaled to 180 points.
Content and Argumentation: 90 points
Score | Description |
---|---|
90-81 | Presents a clear, original perspective with strong supporting evidence. Thoroughly addresses counterarguments. Implications are insightfully discussed. |
80-72 | Perspective is clear with good supporting evidence. Most counterarguments are addressed. Implications are well-discussed. |
71-63 | Perspective is present but could be clearer. Some supporting evidence provided. Some counterarguments addressed. Basic discussion of implications. |
62-0 | Perspective is unclear or weak. Limited supporting evidence. Counterarguments not adequately addressed. Limited or no discussion of implications. |
Organization and Coherence: 30 points
Score | Description |
---|---|
30-27 | Excellent structure with smooth transitions. Ideas flow logically and cohesively. |
26-24 | Good structure with clear transitions. Ideas generally flow well. |
23-20 | Adequate structure but some transitions are unclear. Some ideas may seem disconnected. |
19-0 | Poor structure with abrupt transitions. Ideas often seem disconnected or illogical. |
Writing Quality and Style: 30 points
Score | Description |
---|---|
30-27 | Excellent prose with varied sentence structure. Appropriate academic tone maintained throughout. Minimal to no errors. |
26-24 | Good writing quality with some variety in sentence structure. Generally appropriate tone. Few minor errors. |
23-20 | Adequate writing quality but limited sentence variety. Tone may be inconsistent. Several minor errors. |
19-0 | Poor writing quality with little sentence variety. Inappropriate tone. Numerous errors that impede understanding. |
Adherence to Formatting and Style Guidelines: 30 points
Score | Description |
---|---|
30-27 | Perfectly adheres to all formatting and style guidelines. |
26-24 | Minor deviations from formatting or style guidelines. |
23-20 | Several noticeable deviations from formatting or style guidelines. |
19-0 | Significant deviations from formatting or style guidelines. |
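As a rough illustration of how the category maxima in the draft and final draft rubrics add up to their stated totals, the arithmetic can be sketched as follows. The helper `total_score` and the example scores are hypothetical and are not part of any grading tool used in the course.

```python
# Category maxima copied from the draft and final draft rubrics.
DRAFT_MAX_POINTS = {
    "Content and Argumentation": 40,
    "Organization and Coherence": 15,
    "Writing Quality and Style": 15,
    "Adherence to Formatting and Style Guidelines": 10,
}
FINAL_MAX_POINTS = {
    "Content and Argumentation": 90,
    "Organization and Coherence": 30,
    "Writing Quality and Style": 30,
    "Adherence to Formatting and Style Guidelines": 30,
}


def total_score(scores, max_points):
    """Sum per-category scores after checking each against its maximum."""
    if set(scores) != set(max_points):
        raise ValueError("scores must cover exactly the rubric categories")
    for category, score in scores.items():
        if not 0 <= score <= max_points[category]:
            raise ValueError(f"{category}: {score} is outside 0-{max_points[category]}")
    return sum(scores.values())


print(sum(DRAFT_MAX_POINTS.values()))  # 80
print(sum(FINAL_MAX_POINTS.values()))  # 180

# Hypothetical draft scores, for illustration only.
example_scores = {
    "Content and Argumentation": 34,
    "Organization and Coherence": 13,
    "Writing Quality and Style": 12,
    "Adherence to Formatting and Style Guidelines": 8,
}
print(total_score(example_scores, DRAFT_MAX_POINTS))  # 67
```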
Activities¶
Resources¶
TODO:
Computational biology¶
TODO:
Scientific journals¶
Here are journals that often feature computational biology research. This list is not exhaustive; it focuses on well-known journals.
Computational Biology and Bioinformatics¶
Bioinformatics
Scope: Bioinformatics provides a forum for the exchange of information in the fields of computational molecular biology and genome bioinformatics, with emphasis on new algorithms and databases that advance the progress of bioinformatics and biomedical research in a significant manner.
BMC Bioinformatics
Scope: BMC Bioinformatics is an open access, peer-reviewed journal that considers articles describing novel computational algorithms and software, models and tools, including statistical methods, machine learning and artificial intelligence, as well as systems biology.
Nucleic Acids Research
Scope: Nucleic Acids Research (NAR) publishes the results of leading-edge research into physical, chemical, biochemical and biological aspects of nucleic acids and proteins involved in nucleic acid metabolism and/or interactions. It enables the rapid publication of papers under the following categories:
- Chemistry and Nucleic Acid Chemistry
- Computational biology
- Data Resources and Analyses
- Gene Regulation
- Chromatin and Epigenetics
- Genome Integrity, Repair and Replication
- Genomics
- Molecular Biology
- Nucleic Acid Enzymes
- RNA and RNA-protein complexes
- Structural Biology
- Synthetic Biology and Bioengineering
Briefings in Bioinformatics
Scope: Briefings in Bioinformatics aims to provide an indispensable resource for the experimental practitioner seeking awareness of the disparate sources of data and analytical tools of contemporary biology, biotechnology and medicine based on an explicit molecular description. This includes all areas of genomics, proteomics, lipidomics, glycomics, metabolomics, interactomics and network biology, imaging, systems biology, phenomics, chemoinformatics, computational biology and clinical/medical informatics that have a clear molecular foundation to the study. Large-scale instrumentation and computerisation is reducing the time that needs to be spent in the laboratory. Instead, the rate-limiting step is the analysis and interpretation of data.
PLOS Computational Biology
Scope: PLOS Computational Biology features works of exceptional significance that further our understanding of living systems at all scales—from molecules and cells, to patient populations and ecosystems—through the application of computational methods. Readers include life and computational scientists, who can take the important findings presented here to the next level of discovery.
Journal of Computational Biology
Scope: Journal of Computational Biology publishes articles whose primary contributions are the development and application of new methods in computational biology, including algorithmic, statistical, mathematical, machine learning and artificial intelligence contributions. The journal welcomes novel methods that tackle established problems within computational biology; novel methods and frameworks that anticipate new problems and data types arising in computational biology; and novel methods that are inspired from studying natural computation. Methods should be tested on real and/or simulated biological data whenever feasible. Papers whose primary contributions are theoretical are also welcome. Available only online, this is an essential journal for scientists and students who want to keep abreast of developments in bioinformatics and computational biology.
Computational Biology and Chemistry
Scope: Computational Biology and Chemistry publishes original research papers and review articles in all areas of computational life sciences. High quality research contributions with a major computational component in the areas of nucleic acid and protein sequence research, molecular evolution, molecular genetics (functional genomics and proteomics), theory and practice of either biology-specific or chemical-biology-specific modeling, and structural biology of nucleic acids and proteins are particularly welcome. Exceptionally high quality research work in bioinformatics, systems biology, ecology, computational pharmacology, metabolism, biomedical engineering, epidemiology, and statistical genetics will also be considered.
Chemical Informatics and Molecular Modeling¶
Journal of Cheminformatics
Scope: Journal of Cheminformatics is an open access journal publishing original peer-reviewed research in all aspects of cheminformatics and molecular modelling.
Coverage includes, but is not limited to:
- chemical information systems, software and databases, and molecular modelling
- chemical structure representations and their use in structure, substructure, and similarity searching of chemical substance and chemical reaction databases
- computer and molecular graphics, computer-aided molecular design, expert systems, QSAR, and data mining techniques
Journal of Chemical Information and Modeling
Scope: The Journal of Chemical Information and Modeling (JCIM) publishes papers reporting new methodologies in chemical informatics and molecular modeling and its applications with experimental validation. Specific topics include the representation and computer-based searching of chemical databases, molecular modeling, computer simulation using molecular dynamics and free energy methods, machine learning on chemical and biological data, combined quantum mechanical/molecular mechanical (QM/MM) multi-scale simulations, computer-aided molecular design of new materials, catalysts, or ligands, development of new computational methods or efficient algorithms for chemical software, and biopharmaceutical chemistry including analyses of biological activity and other issues related to drug discovery. JCIM will not consider straightforward applications of molecular docking methods to a single target system without adequate experimental validation.
Journal of Computer-Aided Molecular Design
Scope: The Journal of Computer-Aided Molecular Design provides a forum for disseminating information on both the theory and the application of computer-based methods in the analysis and design of molecules. The scope of the journal encompasses papers which report new and original research and applications in the following areas:
- theoretical chemistry
- computational chemistry
- computer and molecular graphics
- molecular modeling
- protein engineering
- drug design
- expert systems
- general structure-property relationships
- molecular dynamics
- chemical database development and usage
Contributions on computer-aided molecular modeling studies in pharmaceutical, polymer, materials and surface sciences, as well as other molecular-based disciplines, are particularly welcome.
Theoretical and Computational Chemistry¶
Journal of Theoretical Biology
Scope: The Journal of Theoretical Biology is the leading forum for theoretical perspectives that give insight into biological processes. It covers a very wide range of topics and is of interest to biologists in many areas of research.
Journal of Chemical Theory and Computation
Scope: The Journal of Chemical Theory and Computation publishes papers reporting new theories, methodology in quantum electronic structure, molecular dynamics, and statistical mechanics and/or their important applications. Specific topics include advances in ab initio quantum mechanics, density functional theory, design and properties of new materials, surface science, Monte Carlo simulations, solvation models, QM/MM calculations, biomolecular structure prediction, and molecular dynamics in the broadest sense, including gas-phase dynamics, ab initio dynamics, biomolecular dynamics, and protein folding/phase separation. New theories for quantum computers and their applications are welcome, as well as the combination of data science, theory, and computations relevant to chemistry. We also welcome papers on computational chemistry packages that present innovative methods. The Journal does not consider papers that are straightforward applications on only single-class systems of well-established methods, including DFT, traditional wave function theories, and molecular dynamics. The Journal favors submissions that include advances in theory, methodology, and data science with applications to compelling problems in chemistry and materials science.
Scope: Our journal has a wide-ranging scope which covers the full breadth of the chemical sciences. The research we publish contains the sorts of novel ideas, challenging questions and progressive thinking that bring undiscovered breakthroughs within reach.
Your paper could focus on a single area, or cross many. It could be beyond the accepted bounds of the chemical sciences. It might address an immediate challenge, contribute to a future breakthrough or be wholly conceptual.
We’re a team from every field of the chemical sciences, and know from experience that breakthroughs that drive the solutions to global challenges can come from anywhere, at any time. You could even start an entirely new area of research.
Structural Biology¶
Structure
Scope: Structure aims to publish papers of exceptional interest in the field of structural biology. The journal strives to be essential reading for structural biologists, as well as biologists and biochemists that are interested in macromolecular structure and function. Structure strongly encourages the submission of manuscripts that present structural and molecular insights into biological function and mechanism. Other reports that address fundamental questions in structural biology, such as structure-based examinations of protein evolution, folding, and/or design, will also be considered. We will consider the application of any method, experimental or computational, at high or low resolution, to conduct structural investigations, as long as the method is appropriate for the biological, functional, and mechanistic question(s) being addressed. Likewise, reports describing single-molecule analysis of biological mechanisms are welcome.
Biophysical Journal
Scope: BJ publishes original articles, letters, and perspectives on important problems in modern biophysics. Papers should be written to be of interest to a broad community of biophysicists. BJ welcomes experimental studies that use quantitative physical approaches for the study of biological systems, including or spanning scales from the molecule to the whole organism. Experimental studies of a purely descriptive or phenomenological nature, with no theoretical or mechanistic underpinning, are not appropriate for publication in BJ. Theoretical studies should offer new insights into the understanding of experimental results or suggest new experimentally testable hypotheses. Articles reporting significant methodological or technological advances, which have potential to open new areas of biophysical investigation, are also suitable for publication in BJ. Papers describing improvements in the accuracy or speed of existing methods or extra detail within methods described previously are not suitable for BJ.
PROTEINS: Structure, Function, and Bioinformatics
Scope: PROTEINS: Structure, Function, and Bioinformatics publishes original reports of significant experimental and analytic research in all areas of protein research: structure, function, computation, genetics, and design. The journal encourages reports that present new experimental or computational approaches for interpreting and understanding data from biophysical chemistry, structural studies of proteins and macromolecular assemblies, alterations of protein structure and function engineered through techniques of molecular biology and genetics, functional analyses under physiologic conditions, as well as the interactions of proteins with receptors, nucleic acids, or other specific ligands or substrates. Research in protein and peptide biochemistry directed toward synthesizing or characterizing molecules that simulate aspects of the activity of proteins, or that act as inhibitors of protein function, is also within the scope of PROTEINS.
Protein Science
Scope: Protein Science, the flagship journal of The Protein Society, serves as an international forum for publishing original reports on all scientific aspects of protein molecules. The Journal publishes papers by leading scientists from all over the world that report on advances in the understanding of proteins in the broadest sense. Protein Science aims to unify this field by cutting across established disciplinary lines and focusing on “protein-centered” science.
Neuroscience¶
ACS Chemical Neuroscience
Scope: ACS Chemical Neuroscience publishes high-quality research articles and reviews that showcase chemical, quantitative biological, biophysical and bioengineering approaches to the understanding of the nervous system and to the development of new treatments for neurological disorders. Research in the journal focuses on aspects of chemical neurobiology and bio-neurochemistry.
Biology¶
Journals whose scope is mainly biology but still overlaps with computational biology.
PLOS Biology
Scope: PLOS Biology is the flagship PLOS journal in the life sciences and features works of exceptional significance, originality, and relevance in all areas of biological science and at every scale; from molecules to ecosystems, including works at the interface of other disciplines. We also welcome data-driven meta-research articles that evaluate and aim to improve the standards of research in the life sciences and beyond.
Scope: The journal's scope includes diverse biological areas: insights into molecular structure and function, cancer biology, cell biology, genetics, developmental biology, computational methods, global health, evolutionary biology, immunology, inflammation, neuroscience, plant biology, stem cell biology, regenerative medicine, and molecular mechanisms. It welcomes studies offering novel insights that encompass experimental and computational approaches.
Cell
Scope: Cell publishes findings of unusual significance in any area of experimental biology, including but not limited to cell biology, molecular biology, neuroscience, immunology, virology and microbiology, cancer, human genetics, systems biology, signaling, and disease mechanisms and therapeutics. The basic criterion for considering papers is whether the results provide significant conceptual advances into, or raise provocative questions and hypotheses regarding, an interesting and important biological question. In addition to primary research articles in four formats, Cell features review and opinion articles on recent research advances and issues of interest to its broad readership in the leading edge section.
ACS Chemical Biology
Scope: ACS Chemical Biology provides an international forum for the rapid communication of research that broadly embraces the interface between chemistry and biology.
The journal also serves as a forum to facilitate the communication between biologists and chemists that will translate into new research opportunities and discoveries. Results will be published in which molecular reasoning has been used to probe questions through in vitro investigations, cell biological methods, or organismic studies.
We welcome mechanistic studies on proteins, nucleic acids, sugars, lipids, and nonbiological polymers. The journal serves a large scientific community, exploring cellular function from both chemical and biological perspectives. It is understood that submitted work is based upon original results and has not been published previously.
Journal of Physical Chemistry B
Scope: The Journal of Physical Chemistry B (JPC B) publishes experimental, theoretical and computational research in the area of biophysics, biochemistry, biomaterials, and soft matter. Examples of topics of special interest include: biomolecules (proteins, nucleic acids, membranes, enzyme catalysis); biomaterials (including nano-biomaterials); polymers and colloids; liquids (properties of liquids, ionic liquids, deep eutectic solvents, and fluid interfaces, and solid-liquid interfaces, bulk studies of electrolytes); surfactants; glasses; and spectroscopy, charge, and energy transfer of molecules in solution.
Biochemistry
Scope: Biochemistry provides an international forum for publishing exceptional, rigorous, high-impact research across all of biological chemistry. This broad scope includes studies on the chemical, physical, mechanistic, and/or structural basis of biological or cell function, and encompasses the fields of chemical biology, synthetic biology, disease biology, cell biology, nucleic acid biology, neuroscience, structural biology, and biophysics. In addition to traditional Research Articles, Biochemistry also publishes Communications, Viewpoints, and Perspectives, as well as From the Bench articles that report new methods of particular interest to the biological chemistry community.
General¶
Science
Scope: Science seeks to publish those papers that are most influential in their fields or across fields and that will significantly advance scientific understanding. Selected papers should present novel and broadly important data, syntheses, or concepts. They should merit recognition by the wider scientific community and general public provided by publication in Science, beyond that provided by specialty journals.
Nature
Scope: Nature is a weekly international journal publishing the finest peer-reviewed research in all fields of science and technology on the basis of its originality, importance, interdisciplinary interest, timeliness, accessibility, elegance and surprising conclusions. Nature also provides rapid, authoritative, insightful and arresting news and interpretation of topical and coming trends affecting science, scientists and the wider public.
Proceedings of the National Academy of Sciences
Scope: The Proceedings of the National Academy of Sciences (PNAS), a peer reviewed journal of the National Academy of Sciences (NAS), is an authoritative source of high-impact, original research that broadly spans the biological, physical, and social sciences. The journal is global in scope and submission is open to all researchers worldwide.
Skills for computational biologists¶
Computational biology is a rapidly growing field that requires a diverse set of skills. Technical skills such as programming and data analysis are essential for working with large datasets and developing computational models. However, soft skills such as communication at various levels of detail and collaboration are essential for working effectively in interdisciplinary teams.
Content¶
Open a Top Hat discussion question and prompt students to submit the skills they think a successful computational biologist should have. Display the responses on the screen with anonymous mode turned on. As answers are submitted, the instructor should discuss them and add context. Encourage everyone to submit something, as this will contribute to their participation grade.
After there is a lull in responses, move forward with slides that discuss what the instructor believes are important skills to have.
Hard skills¶
Hard skills are technical abilities or knowledge specific to a particular field. They can be learned through education, training, or on-the-job experience.
- Command-line scripting in a shell such as Bash.
- Programming languages such as
- Machine learning frameworks such as
- Sequencing tools.
- Knowledge of
- molecular biology,
- cell biology,
- physiology,
- genetics,
- microbiology,
- biochemistry, etc.
- Understanding of linear algebra, probability and statistics, and differential equations.
- Familiarity with visualization tools to effectively communicate data and results.
For example,
- matplotlib: A comprehensive library for creating static, animated, and interactive visualizations in Python;
- seaborn: A Python data visualization library based on matplotlib that provides a high-level interface for drawing attractive and informative statistical graphics;
- plotly: An open-source data analytics and visualization tool that creates interactive charts for web browsers and supports multiple languages, such as Python, Julia, R, and MATLAB;
- D3.js: A JavaScript library for manipulating documents based on data. It helps you bring data to life using HTML, SVG, and CSS;
- p5.js: A JavaScript library for creative coding that makes coding accessible for artists, designers, educators, and beginners;
- Adobe Illustrator: A vector graphics editor developed by Adobe Systems;
- Inkscape: An open-source vector graphics editor similar to Adobe Illustrator;
- Jmol: A free and open source viewer of molecular structures with features for chemicals, crystals, materials and biomolecules;
- BioRender: An online tool for creating scientific figures and illustrations;
- GIMP: An open-source image editing program;
- MGLTools: A collection of methods for visualization and analysis of biomolecular systems;
- ImageMagick: A software suite to create, edit, compose, or convert bitmap images from the command line;
- VMD: A molecular visualization program for displaying, animating, and analyzing large biomolecular systems using 3-D graphics and built-in scripting;
- CIRCOS: Designed for visualizing genomic data but can create figures from data in any field;
- ChimeraX: A program for the interactive visualization and analysis of molecular structures and related data;
- Blender: An open-source 3D creation suite.
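To make the first tool in this list concrete, here is a minimal sketch of a matplotlib line plot. It assumes matplotlib is installed; the function name `plot_growth_curve` is hypothetical, and the data are made up purely for illustration.

```python
import io
import math

import matplotlib

matplotlib.use("Agg")  # render off-screen so no display is required
import matplotlib.pyplot as plt


def plot_growth_curve(hours, od600):
    """Return PNG bytes of a simple line plot of optical density over time."""
    fig, ax = plt.subplots(figsize=(4, 3))
    ax.plot(hours, od600, marker="o")
    ax.set_xlabel("Time (hours)")
    ax.set_ylabel("OD600")
    ax.set_title("Hypothetical growth curve")
    fig.tight_layout()
    buffer = io.BytesIO()
    fig.savefig(buffer, format="png")
    plt.close(fig)  # release the figure's memory
    return buffer.getvalue()


# Made-up logistic-shaped data, for illustration only.
hours = list(range(0, 13))
od600 = [1.0 / (1.0 + math.exp(-(h - 6))) for h in hours]
png_bytes = plot_growth_curve(hours, od600)
```

The returned bytes can be written to a `.png` file or embedded in a report. Labeling both axes and giving the figure a title are the habits that matter most when communicating results.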
Soft skills¶
Soft skills are often called “people” or “interpersonal” skills and relate to how you interact with and relate to others.
Collaboration
Collaboration is a pivotal soft skill for computational biologists: it lets them work effectively with diverse peers toward shared objectives. Because computational biology is multidisciplinary, merging insights from biology, computer science, mathematics, and statistics, professionals often work within teams that span a range of expertise.
Effective collaboration lets computational biologists draw on their colleagues' specialized knowledge to reach more complete and sound solutions to complex research problems. Team members pool their insights, build on each other's ideas, and offer constructive feedback and assistance to overcome challenges together. Collaboration also fosters innovation and creativity, because it encourages team members to look beyond conventional strategies and explore new approaches.
Problem-solving
Problem-solving is an essential skill for computational biologists, equipping them to tackle complex research questions and devise practical solutions. Computational biology relies on computational methods to analyze and interpret complex biological datasets.
Problem-solving skills matter because they help computational biologists identify the core challenges and constraints of a research problem. This involves breaking complex problems into more manageable parts, using logical reasoning and critical evaluation to examine the situation from multiple angles, and applying creativity to develop new solutions.
Critical thinking
Critical thinking is a pivotal competency for computational biologists, enabling them to dissect information judiciously and form well-considered judgments. In computational biology, computational approaches are harnessed to dissect and interpret expansive and intricate biological data.
The essence of critical thinking for computational biologists is its capacity to evaluate evidence, pinpoint assumptions and biases, and entertain many viewpoints. This necessitates meticulous scrutiny and data analysis, contemplation of alternate explanations, and making sound decisions informed by the available evidence.
Creativity
Creativity is pivotal for computational biologists, giving them the ability to generate novel concepts and methodologies to confront challenges. Strong creative skills help computational biologists think beyond conventional paradigms, paving the way for inventive strategies in data analysis. This can involve designing new algorithms or models or applying existing techniques in new ways. By harnessing creativity, computational biologists can arrive at insights and breakthroughs that conventional approaches might miss.
Emotional intelligence
Emotional intelligence takes center stage as a crucial skill for computational biologists, endowing them with the ability to comprehend and navigate their own emotions alongside those of their peers. Given the collaborative milieu of computational biology, which often entails cross-disciplinary teamwork, strong emotional intelligence bolsters effective communication and collaboration.
For instance, a computational biologist adept in emotional intelligence can swiftly discern a colleague's frustration or overwhelm, extending the appropriate support or aid. By showcasing robust emotional intelligence, computational biologists position themselves as adept leaders and communicators, unearthing new avenues for professional advancement.
Adaptability
Adaptability emerges as a pivotal skill for computational biologists, mirroring the perpetual evolution of the field. With continuous developments in technologies, methodologies, and techniques, computational biologists must seamlessly assimilate these shifts to remain current in their domain.
Adaptability empowers computational biologists to swiftly amass and apply fresh skills and insights, all while remaining flexible in their problem-solving methodologies. This could involve mastering novel programming languages, staying abreast of the latest research advancements, or exploring innovative angles in data analysis.
Negotiation and conflict resolution
Negotiation and conflict resolution skills help computational biologists navigate disagreements and disputes constructively. Computational biology's collaborative framework often entails interaction with colleagues from diverse disciplines, and adept negotiation and conflict resolution skills facilitate effective communication and harmonious collaboration.
Negotiation involves arriving at mutually advantageous agreements through dialogue and compromise. This entails pinpointing common ground, identifying shared objectives, and collaborating on solutions that cater to all parties involved. Conflict resolution, in turn, is the ability to address conflicts and discord constructively, which involves identifying the underlying roots of the disagreement, fostering candid communication, and collaborating to generate solutions accepted by all.
Time management
Time management is important for computational biologists, empowering them to prioritize and manage their workload methodically. Juggling multiple projects is a norm in computational biology, necessitating adept allocation of time and resources to meet deadlines and accomplish objectives.
Effective time management encompasses:
- Clear goal-setting and prioritization
- Breaking intricate tasks into manageable segments
- Employing tools like calendars and to-do lists to stay organized
Furthermore, it entails adaptability to shifting circumstances, whether unexpected delays or novel priorities, and the flexibility to revise plans as necessary.
Ended: Computational biology
Literature ↵
Active reading¶
Below is a list of some possible active reading techniques students can use. Demonstrate a few of these techniques with examples in a presentation format.
- Preview the text: Before diving into the article, scan the title, abstract, headings, subheadings, figures, and tables. This preview provides an overview and helps you identify the main topics and key points.
- Highlight and annotate: Use highlighting or underlining to mark important passages, key terms, and evidence. Write brief annotations or comments in the margins to clarify your understanding or ask questions.
  Danger
  Use highlighting and annotation sparingly. It rarely helps solidify your understanding of an article. It is most beneficial when you are reviewing an article to refresh your memory of its content.
- Summarize paragraphs: After reading a paragraph or section, pause and summarize it in your own words. This forces you to process and internalize the information before moving on.
- Ask questions: Formulate questions as you read. What is the research question? How was the study conducted? What are the main findings? Asking and answering questions actively engages your critical thinking.
- Pause and reflect: Periodically pause and reflect on your reading. Consider how the current section relates to the overall research or the paper's central argument.
- Make connections: Relate the new information to your existing knowledge or experiences. Connecting new content to what you already know enhances comprehension and retention.
- Challenge yourself: When facing unfamiliar terms or concepts, challenge yourself to look them up or seek additional resources to gain a deeper understanding.
- Discuss with peers: Discuss the article's content with classmates or colleagues. Sharing perspectives and insights can clarify your understanding and uncover different interpretations.
- Revisit and review: Review your notes and annotations after completing the reading. Summarize the main points and key takeaways. Revisiting the material improves retention.
- Teach or explain to others: Share what you've learned with someone, such as a peer or mentor. Teaching or explaining concepts to others reinforces your understanding.
Critical evaluation¶
The critical evaluation of scientific literature holds paramount importance for researchers seeking to leverage the power of data analysis and computational techniques in biological research. As you navigate this multidisciplinary field, applying a discerning approach to assess the credibility and relevance of the sources you encounter is essential.
Begin your evaluation by scrutinizing the expertise and credentials of the authors. Look for authors with a solid foundation in biology and computational methods, as this dual expertise is vital for conducting meaningful research in this field. Each author does not have to have these skills individually, but all authors jointly should cover all relevant topics. Examine their academic qualifications and affiliations with renowned institutions, as these factors often correlate with the quality of research presented.
Publication information is a key factor in evaluating the timeliness and applicability of the information. Given the rapidly evolving nature of biology and computational methods, ensure that the article's publication date aligns with the current state of knowledge. Moreover, you should remain skeptical even of journals with a strong reputation for rigorous peer review and editorial standards. While these journals advertise that their methodologies and analyses are of the highest quality, there can still be issues. You should be more skeptical of journals with low reputations or no peer review. I am not saying that these articles are bad, but that the criteria for publication are sometimes less stringent.
Research methods play a pivotal role in computational biology research. Evaluate the study's research design, paying particular attention to the integration of computational methods with biological questions. Assess the sample size, experimental design, simulation parameters, appropriate implementation of algorithms, data collection, and analysis. A well-designed study should effectively combine computational tools with biological insights to yield robust and relevant results.
As you delve into the results and interpretation of a computational biology study, seek a clear presentation of data, appropriate statistical analyses, and insightful interpretation. The visualizations and computational models should facilitate a deeper understanding of complex biological phenomena. Consider the significance of the results in the broader biological context. Do the computational findings align with established biological principles and provide new insights that advance our understanding?
References and citations serve as a foundation for the credibility of a computational biology study. Examine the range and quality of sources cited, including computational and biological references. Well-supported arguments draw on various reputable sources, such as peer-reviewed computational journals, biological databases, and interdisciplinary works. Additionally, take note of the study's impact within the computational biology community by assessing the frequency of citations by other researchers.
Peer review remains a cornerstone of credible scientific research, including computational biology. While peer-reviewed articles undergo evaluation by experts in both computational methods and biology, remember that this process doesn't eliminate all potential sources of error. Furthermore, the replication of computational findings by independent researchers is a testament to the robustness of the methods and results. In this dynamic field, staying attuned to potential conflicts of interest is vital, as they could influence the computational analyses or interpretations.
Lastly, consider the clarity of the writing and the accessibility of complex computational concepts. A well-written computational biology article should communicate intricate computational methodologies and their biological implications effectively. Striking a balance between technical terminology and broader accessibility ensures that the research can be understood and applied by researchers from various backgrounds.
Literature¶
Possible articles¶
This is a collection of possible research articles published during 2023.
Chosen¶
Structural¶
- https://pubs.acs.org/doi/10.1021/acs.jcim.2c01546#
- https://pubs.acs.org/doi/full/10.1021/acs.jmedchem.2c00991
- https://pubs.acs.org/doi/10.1021/acs.jcim.3c01469#
- https://pubs.acs.org/doi/10.1021/acs.jcim.2c01099#
- https://www.nature.com/articles/s42256-023-00712-7
- https://www.nature.com/articles/s41586-024-07487-w
- https://pubs.acs.org/doi/10.1021/acs.jctc.2c01189#
omics¶
- https://academic.oup.com/bioinformatics/article/39/1/btad014/6989621
- https://academic.oup.com/nar/article/51/1/68/6965449
- https://academic.oup.com/nar/article/51/13/6578/7184155
- https://academic.oup.com/bioinformatics/article/39/2/btad079/7056637
- https://academic.oup.com/nar/article/51/6/2759/7068371
- https://academic.oup.com/nar/article/51/2/553/6976060
Searching¶
In the vast ocean of scientific knowledge, finding and searching for relevant literature requires both skill and strategy. This involves not only utilizing search engines but also adopting a systematic approach that leverages advanced search techniques, digital libraries, and critical evaluation. Navigating the vast expanse of scientific literature requires navigational tools such as digital libraries and databases. These platforms provide access to an immense array of research articles, conference papers, reviews, and more.
Resource | Description | Scope |
---|---|---|
Google Scholar | Google Scholar is a free academic search engine that indexes the full text or metadata of scholarly literature across an array of publishing formats and disciplines. | All |
PubMed | PubMed is a free search engine accessing primarily the MEDLINE database of references and abstracts on life sciences and biomedical topics. | Life sciences |
Semantic Scholar | Semantic Scholar is a free, AI-powered research tool for scientific literature, based at the Allen Institute for AI. | All |
Lens | Lens is a free scholarly search and citation index tool based on open-source data provided by Cambia, an Australian non-profit organization. | All |
IEEE Xplore | IEEE Xplore is a digital library and research database for discovery and access to journal articles, conference proceedings, technical standards, and related materials. | Electronics, electrical engineering, computer science |
Scopus | Scopus uniquely combines a comprehensive, expertly curated abstract and citation database with enriched data and linked scholarly literature across a wide variety of disciplines. | All |
How you use the search bar can dramatically impact the quality of the search results. For example, you must carefully choose keywords that are not too restrictive or broad. Advanced searching techniques can be used to find articles:
- Use boolean operators like `AND`, `OR`, and `NOT`;
- The `" "` operator makes sure that the exact phrase is present;
- Specify a number range like `2004..2007`;
- Similar words can be specified with a preceding `~`;
- A `*` wildcard can be used as a fill-in-the-blank.
By strategically combining these techniques, researchers enhance the precision and relevance of their search results.
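As a rough illustration of combining these operators, the sketch below assembles a query string from keyword groups. The helper name `build_query` and the example keywords are hypothetical, and the exact syntax a given search engine accepts (parentheses, `NOT` versus a leading minus sign) is an assumption you should check against that engine's help pages.

```python
def build_query(required, phrases=(), alternatives=(), excluded=()):
    """Compose a boolean search string from groups of keywords."""
    parts = list(required)
    # Wrap exact phrases in double quotes so they are matched verbatim.
    parts += [f'"{phrase}"' for phrase in phrases]
    # Group interchangeable terms with OR inside parentheses.
    if alternatives:
        parts.append("(" + " OR ".join(alternatives) + ")")
    query = " AND ".join(parts)
    # Append each term to exclude with NOT.
    for term in excluded:
        query += f" NOT {term}"
    return query


query = build_query(
    required=["docking"],
    phrases=["molecular dynamics"],
    alternatives=["protein", "peptide"],
    excluded=["review"],
)
print(query)
# docking AND "molecular dynamics" AND (protein OR peptide) NOT review
```

Starting with a broad query and then adding one restriction at a time makes it easier to see which keyword is narrowing the results too much.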
Types of scientific literature¶
Primary¶
A primary research article, often simply referred to as a "research article," is a fundamental type of scientific literature that presents the original research conducted by the authors. These articles report on new findings, experiments, and studies conducted by researchers to answer specific research questions or test hypotheses. Primary research articles play a crucial role in advancing scientific knowledge by sharing novel discoveries and contributing to the ongoing discourse within a field.
Key characteristics of a primary research article include:
- Original research: Primary research articles document research that is conducted firsthand by the authors. This involves designing and executing experiments, collecting data, analyzing results, and drawing conclusions based on the data.
- Standard format: They typically follow a structured format that includes sections such as Introduction, Methods, Results, and Discussion (commonly known as IMRAD). This format ensures consistency across articles and helps readers easily locate specific information.
- Introduction: The introduction section of a primary research article provides background information on the research topic, outlines the research question or objective, and establishes the context for the study.
- Methods: The methods section details the experimental design, materials used, data collection procedures, and statistical or analytical methods applied. This section allows other researchers to understand and replicate the study if necessary.
- Results: The results section presents the findings of the research, often utilizing tables, graphs, figures, and other visual aids to convey the data. This section is focused on presenting raw data without extensive interpretation.
- Discussion: In the discussion section, authors interpret their results, relate them to existing knowledge in the field, discuss implications, and sometimes propose future research directions.
- Citations: Primary research articles include citations to previously published research that influenced the study's design, methods, or interpretation. Proper citation acknowledges the contributions of other researchers and situates the study within the broader context of existing knowledge.
Primary research articles in computational biology often focus on developing and applying computational techniques for analyzing biological data. These articles can illuminate underlying biological mechanisms and help advance our understanding of complex biological systems. They can also introduce new tools that can enhance other researchers’ work. By reading primary research articles, students and researchers in computational biology can stay up-to-date with the latest advancements in the field and learn about innovative techniques and approaches.
Review¶
A review article is a type of scientific literature that aims to provide a comprehensive and synthesized overview of existing research on a specific topic. Unlike primary research articles that report original research findings, review articles analyze and compile information from various primary sources to present a coherent understanding of a particular subject. They serve as valuable resources for researchers, students, and professionals who seek to gain an in-depth understanding of a specific area of knowledge without going through many primary research papers.
Key characteristics of a review article include:
- Comprehensive analysis: Review articles offer a broad survey of the research landscape within a defined topic area. They aim to cover significant research findings, methodologies, trends, controversies, and gaps in the field.
- Integration of research: Authors of review articles gather information from multiple primary research articles, often spanning different studies, experiments, and authors. This integration allows them to synthesize and consolidate the findings into a coherent narrative.
- Objective presentation: Review articles may include the author's interpretations but generally maintain a balanced and objective tone. The focus is on presenting the collective knowledge within the field rather than advancing new hypotheses or experimental data.
- Structured format: Similar to primary research articles, review articles often follow a structured format. This typically includes an introduction that outlines the scope and purpose of the review, sections that delve into various aspects of the topic, and a conclusion that summarizes the essential findings and suggests potential future directions for research.
- Citation of primary sources: Proper citation is a crucial aspect of review articles. Authors must provide citations for the primary research articles and studies they reference. This allows readers to trace back to the sources to explore specific topics in more detail.
Review articles are particularly beneficial in several ways:
- Efficient overview: They provide a concise and organized overview of a complex topic, saving readers time and effort in sifting through numerous primary sources.
- Identifying trends and gaps: Review articles often highlight the evolution of ideas within a field, identify gaps in knowledge, and suggest potential research directions.
- Educational resource: Review articles are helpful educational tools for students and newcomers to a field, as they offer a solid foundation for understanding the state of current knowledge.
- Clarifying controversies: Review articles can help readers understand ongoing debates and controversies within a field by presenting multiple perspectives and interpretations.
Opinion¶
An opinion piece, also known as an opinion article, is a type of written content that expresses the author's personal viewpoints, interpretations, hypotheses, or reflections on a particular topic. Opinions are distinct from primary research and review articles in research and academia. While primary research articles present original research findings and review articles synthesize existing research, opinion pieces offer subjective perspectives, insights, or arguments related to a specific subject within a field.
Key characteristics of an opinion piece in research include:
- Subjective interpretation: Unlike primary research articles that rely on empirical data and evidence and review articles that synthesize existing research, opinion pieces are inherently subjective. They reflect the author's viewpoints and interpretations.
- Engagement with ideas: Opinion pieces often engage with broader concepts, theories, controversies, or trends within a field. Authors may provide critical analyses, propose alternative explanations, challenge prevailing norms, or speculate about future directions.
- Exploratory nature: Opinion pieces are exploratory in nature. Authors use them as platforms to brainstorm ideas, encourage dialogue, and spark further investigation into specific topics.
- Less formal structure: While primary research and review articles follow structured formats, opinion pieces may have a more flexible structure. However, they typically include an introduction that sets the context, a body presenting the author's argument or perspective, and a conclusion summarizing key points.
- Use of rhetorical devices: Authors of opinion pieces may use rhetorical devices such as anecdotes, metaphors, and persuasive language to effectively engage readers and convey their perspectives.
- Stimulating discussion: Opinion pieces often serve as catalysts for discussion, inviting readers to consider alternative viewpoints and engage in thoughtful debate.
- Varied audience: Opinion pieces can target experts within a field and a wider audience interested in the topic. They bridge the gap between specialized research and general public interest.
Opinion pieces play a valuable role in research and academia by contributing to intellectual discourse, expanding the range of perspectives, and fostering critical thinking. They allow researchers to speculate, propose novel ideas, and challenge conventional wisdom.
Ended: Literature
Presenting ↵
Accessibility¶
TODO: Add more
Fonts¶
Use a sans-serif font such as Open Sans, Roboto, Lato, Helvetica, Calibri, etc. Also, ensure that there is adequate spacing between the letters and words. Avoid italics or underlining; use bold fonts for emphasis.
Background¶
Bright white backgrounds can make text harder to read. You can either
- use an off-white or cream background,
- or dark background with light text.
Layout¶
A colorful, high-contrast graphic layout with pictures and text creates a structured design. This is easier for people with dyslexia to understand.
Images¶
Remember to add Alt Text to every image in your presentation.
Color¶
Generally, only include up to three colors on your slides. Use high-contrast colors that are across from each other (i.e., complementary) on the color wheel.
You can use one of these tools to check the contrast. The contrast ratio should be at least 4.5:1 to reach WCAG Level AA.
Also, be sure to use a colorblind-friendly palette. Here are some tools to help you check this.
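If you prefer to check the 4.5:1 contrast threshold yourself, the ratio can be computed from the two colors directly. The sketch below follows the WCAG 2.x definitions of relative luminance and contrast ratio; the function names and the example colors are hypothetical choices for illustration, and it assumes colors are given as six-digit hex strings.

```python
def relative_luminance(hex_color):
    """Relative luminance of an sRGB color written as '#RRGGBB'."""
    hex_color = hex_color.lstrip("#")
    # Scale each 8-bit channel to the range 0..1.
    channels = [int(hex_color[i : i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma encoding to get linear light values.
    linear = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """Contrast ratio between two colors, from 1 (identical) to 21."""
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


# Dark gray text on a cream background (example colors only).
ratio = contrast_ratio("#333333", "#FFFDD0")
print(f"{ratio:.2f}:1", "passes AA" if ratio >= 4.5 else "fails AA")
```

Black on white gives the maximum ratio of 21:1, and any color against itself gives 1:1, which makes for two quick sanity checks.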
Resources¶
Design frameworks¶
TODO: Add more
Assertion and evidence¶
This framework segments and simplifies information in a way that improves slides for teaching, conferences, and other presentations. It is based on the idea that a presentation should be built on messages (not topics) to tell a coherent and compelling story about your work. Those messages are then supported with visual evidence (not bullet lists).
Resources
Effective strategies¶
TODO: Add more
One idea per slide¶
It is helpful to have each slide revolve around a central objective, containing only the main idea or a pertinent question. Some supporting evidence can, and often should, be included. By adhering to this principle, we ensure clarity and prevent information overload.
This often takes the form of chunking or breaking down complicated topics in scientific presentations. You would build understanding by building on your audience's common foundation or knowledge. Scaffolding is another way to think about this.
This strategy simplifies comprehension and facilitates better organization and fluid storytelling within the presentation.
One minute per slide¶
When delivering a presentation, try to discuss each slide for only one minute. This guideline serves as a practical approach to maintaining audience engagement. Doing so ensures that they remain attentive and receptive to your message.
If you find yourself repeatedly going over one minute for a particular slide, it could indicate that it is overloaded with information. Consider whether some or all of the information is needed. Ask yourself, "If I remove this slide, will the audience still be able to understand my main points?" If not, split the slide into smaller chunks.
By adhering to this guideline, you can streamline your presentation. It will keep it concise, engaging, and effective.
Put your slide's takeaway as your heading¶
The heading serves as a crucial element of every slide. Use this to your advantage. Put your key message or takeaway as your heading so the audience can grasp the central message at a glance. This strategic structuring optimizes the comprehensibility and impact of your presentation.
Only include essential information¶
The audience's attention is not selective. Their eyes will naturally scan the slide and settle on things that are flashy or that they do not understand. If a visually captivating element is introduced on the slide, there is a strong chance they will not hear what you are saying. Prioritize a clean and elegant slide design that supports your words for the next 45 seconds or so.
Give credit¶
Include precise citations at the bottom of your slides. If multiple sources are on the slide, include a footnote-style reference number or put it beneath the relevant figure or table. This upholds academic integrity and allows your audience to verify and delve deeper into your material.
In some circumstances, acknowledge individuals responsible for the work on that slide. It credits their contributions, humanizes science, and enhances transparency.
Resources¶
Presenting¶
Here are some resources to help you prepare presentations.
The following presentation was made using slides.com driven by an HTML presentation framework.
Storytelling¶
Alex's PhD research¶
Below is a presentation Alex made to present in the Department of Chemical Engineering's research day during his PhD. I have started adding annotations on each slide to explain why it is there.
Ended: Presenting
Sorting hat¶
To keep group assignment and scheduling fair, I will use this script. It reads a YAML file and puts students into their preferred slots if available.
import math
import random

import yaml

# Roster and events. A preference is an event label (or a list of labels),
# or null when the person has no preference.
yaml_file = r"""people:
  - name: Mr. Bubbles
    preferences: null
  - name: Captain Crunch
    preferences: null
  - name: Sir Fluffington
    preferences: null
  - name: Baroness Sock Puppet
    preferences: null
  - name: Count Snickerdoodle
    preferences: null
  - name: Lord Wafflecone
    preferences: null
  - name: Emperor Noodlehead
    preferences: null
  - name: Sir Ticklemuffin
    preferences: null
  - name: Captain Squeezykins
    preferences: null
  - name: King Noodlewhisker
    preferences: null
  - name: Princess Butterfluff
    preferences: null
  - name: Dr. Popcornopolis
    preferences: null
  - name: Countess Gigglesnort
    preferences: null
  - name: Sir Pancakestack
    preferences: null
  - name: Professor Marshmallow
    preferences: null
  - name: Lady Whiskerfluff
    preferences: null
  - name: Captain Sprinkles
    preferences: null
  - name: Duchess Cuddlebug
    preferences: null
  - name: Emperor Puddingcup
    preferences: null
  - name: Count Tickletummy
    preferences: null
  - name: Baroness Bubblesocks
    preferences: null
  - name: Lady Jellybean
    preferences: null
  - name: King Muffinwhisker
    preferences: null
events:
  - label: Group 1
  - label: Group 2
  - label: Group 3
  - label: Group 4
  - label: Group 5
  - label: Group 6
  - label: Group 7
  - label: Group 8
"""


class Person:
    def __init__(self, name: str, preferences: tuple = None):
        self.name = name
        # Allow a single preference to be given as a bare string.
        if isinstance(preferences, str):
            preferences = (preferences,)
        self.preferences = preferences

    def __str__(self):
        prefs = self.preferences
        if prefs is None:
            # Wrap in a tuple so join() does not split "None" into characters.
            prefs = ("None",)
        return f"{self.name} preferences: {', '.join(prefs)}"


class Event:
    def __init__(self, label: str, n_slots: int = 1):
        self.label = label
        self.n_slots = n_slots
        self.slots = []

    def add_person(self, person):
        # Silently ignore the request when the event is already full.
        if self.slots_remaining > 0:
            self.slots.append(person)

    @property
    def slots_remaining(self):
        return self.n_slots - len(self.slots)


class Schedule:
    def __init__(self, events):
        self.events = events

    def __str__(self):
        sched = ""
        for event in self.events:
            sched += (
                event.label
                + ": "
                + ", ".join([person.name for person in event.slots])
                + "\n"
            )
        return sched


def assigner(events: list, people: list, n_shuffle: int = 100):
    unassigned_people = list(people)
    event_labels = [event.label for event in events]
    # Randomize the order in which people are placed.
    for _ in range(n_shuffle):
        random.shuffle(unassigned_people)
    # Put all students in their first choice, or the next available choice
    for person in unassigned_people:
        if person.preferences is not None:
            # Note: if every preferred event is full, the person stays unassigned.
            for preference in person.preferences:
                event_idx = event_labels.index(preference)
                if events[event_idx].slots_remaining > 0:
                    events[event_idx].add_person(person)
                    break
        else:
            # No preference: take the first open event in a random order.
            event_idxs = list(range(len(events)))
            for _ in range(n_shuffle):
                random.shuffle(event_idxs)
            for event_idx in event_idxs:
                if events[event_idx].slots_remaining > 0:
                    events[event_idx].add_person(person)
                    break
    return Schedule(events)


def optimize_slots(n_events, n_people):
    # Give every event the same base size, then hand out the leftover
    # people one at a time starting from the first event.
    base_n_slots = math.floor(n_people / n_events)
    slots = [base_n_slots] * n_events
    n_error = n_people - n_events * base_n_slots
    for i in range(n_error):
        slots[i] += 1
    return slots


def main():
    info = yaml.safe_load(yaml_file)
    people = [Person(**person_info) for person_info in info["people"]]
    events = [Event(**event_info) for event_info in info["events"]]
    event_slots = optimize_slots(len(events), len(people))
    for event, n_slots in zip(events, event_slots):
        event.n_slots = n_slots
    schedule = assigner(events, people, n_shuffle=100)
    print(schedule)


main()
Group 1: Captain Crunch, Sir Ticklemuffin, Countess Gigglesnort
Group 2: Captain Sprinkles, Count Snickerdoodle, Duchess Cuddlebug
Group 3: Lord Wafflecone, Count Tickletummy, King Noodlewhisker
Group 4: Sir Pancakestack, Mr. Bubbles, Baroness Bubblesocks
Group 5: Professor Marshmallow, Sir Fluffington, Baroness Sock Puppet
Group 6: Emperor Puddingcup, Captain Squeezykins, Emperor Noodlehead
Group 7: Lady Whiskerfluff, King Muffinwhisker, Lady Jellybean
Group 8: Dr. Popcornopolis, Princess Butterfluff
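The group sizes in the output above (seven groups of three and one group of two) come from spreading 23 people over 8 groups as evenly as possible. A minimal sketch of that arithmetic is below; `even_slots` is a hypothetical stand-in written with `divmod`, not the script's own `optimize_slots`, though it is intended to produce the same sizes.

```python
def even_slots(n_events, n_people):
    """Split n_people across n_events so group sizes differ by at most one."""
    # base is the size every group gets; extra is how many groups get one more.
    base, extra = divmod(n_people, n_events)
    return [base + 1 if i < extra else base for i in range(n_events)]


slots = even_slots(n_events=8, n_people=23)
print(slots)  # [3, 3, 3, 3, 3, 3, 3, 2]
```

Because the extra seats always go to the first groups in the list, which students land in the larger groups depends only on the random assignment order, not on the group labels.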
Writing ↵
Annotation format¶
We will use the following annotation categories for writing critique and feedback.
- Mechanics: Issues related to grammar, punctuation, spelling, and scientific understanding in the writing.
- Organization: Problems such as disorganized thoughts, an unclear progression of ideas, or sections that do not flow smoothly.
- Constructive feedback: Constructive criticism and suggestions for improvement.
- Style: Comments on the stylistic elements of the writing, including choice of words, sentence structure, and overall tone.
- Plagiarism: Text is heavily quoted from another source.
I will explain the meaning behind the first annotation, but subsequent ones will just be highlighted.
CARS introduction model¶
Introductions must capture the research's essence while establishing significance within the broader scientific landscape. The Create a Research Space (CARS) model, developed by John Swales, defines a strategic approach to writing introductions. A CARS introduction is a well-crafted roadmap that introduces the topic, establishes its relevance, positions it within existing research, and sets the stage for your unique contribution.
It is broken into three moves, discussed below.
Move 1: Establishing a territory¶
In scientific writing, authors will establish the significance and relevance of their research. During this literature review phase, writers engage with existing scholarly work by envisioning it as an ongoing academic dialogue. Writers structure their synthesis of past research, positioning themselves thoughtfully within the scholarly conversation surrounding their topic.
Setting the field¶
The initial task in crafting an effective introduction is concisely summarizing the field under investigation. This brief yet crucial opening primes the reader for what lies ahead in the article. It serves as a compass, orienting them toward the core concepts and motivation of the research.
Defining the field of study¶
Following this, it's essential to furnish the reader with the necessary background information about the topic. This includes fundamental knowledge crucial for comprehending the central ideas and issues addressed throughout the article.
Relevance and interest¶
Now, the introduction should transition into demonstrating the topic's relevance or interest in the real world or within contemporary research. This contextualization illuminates the practical implications or scholarly significance of the subject matter. In this section, you answer the question, "Why does this research matter?"
Referencing previous research¶
As mentioned earlier, refer to previous research conducted within the field. This substantiates your understanding of the subject and connects your work to the broader scholarly conversation.
Move 2: Defining your niche¶
In this move, authors do two things:
- identify gaps or deficiencies in existing research, and
- emphasize the necessity for further exploration or validation.
Writers then discern areas where previous work may fall short or require further investigation. They refine and evolve the broader body of knowledge on their chosen topic.
Making a Counter-Claim¶
A pivotal step in a Swalesian introduction involves carving out a unique niche for your research. This can be accomplished by presenting a counter-claim or opposing viewpoint. By highlighting limitations or flaws in existing research, you pave the way for your distinctive contribution to the field.
Identifying a Research Gap¶
Alternatively, you can establish your niche by identifying a deficiency in prior research. This approach emphasizes the uncharted territory you intend to explore, such as areas with little prior research.
Continuing the Inquiry¶
Another option for carving out your niche is to propose building upon existing research. Acknowledge the conclusions drawn by previous scholars and then use a connector statement to bridge it with your work. This demonstrates your commitment to extending the research within the field.
Move 3: Occupying the niche¶
Writers show how their work resolves (or, in the case of a proposal, will resolve) the gap, shortcoming, or limitation in existing work or that it successfully extends or verifies past research. Imagine that you now have everyone's attention and must explain to fellow scholars how your ideas will add to or move the conversation forward.
Outlining Objectives¶
Clearly outlining the objectives of your research is crucial. This statement should articulate what your study aims to achieve and what the reader can expect to glean from your article.
Explaining Methodology¶
While a comprehensive explanation of your methodology is reserved for the body of the article, a brief overview in the introduction gives the reader a preliminary understanding of how you arrived at your conclusions.
Structural Overview¶
It's helpful to give the reader a glimpse of the article's organizational structure. Outlining the main sections and their sequence helps them anticipate the flow of information as they delve further into the article.
Evaluating Findings¶
To conclude the introduction, offer a summary of your research findings. Additionally, discuss how these findings contribute to the existing body of knowledge and hint at potential avenues for future research that your work has uncovered.
Writing¶
Providing and receiving feedback¶
Feedback is essential for writing and presentations because it promotes personal and professional growth by targeting critical aspects of one’s performance. With ongoing constructive feedback, an individual can hone skill sets in an organized way and maintain steady growth. Without feedback, bad habits are often overlooked and become permanent, and giving up is more likely without proper structure and guidance.
For writing, feedback is important because it helps you improve your writing skills. It can help you identify areas to improve, such as grammar, punctuation, or sentence structure. Feedback can also help you develop your ideas and arguments more effectively. You can learn how to communicate your ideas better and make your writing more engaging and persuasive by receiving feedback from others. Feedback can also help you develop your own voice as a writer. By receiving feedback from others, you can learn what works and what doesn’t work in your writing. This can help you develop a unique style that differentiates you from other writers.
For presentations, feedback can be used as a gauge for audience engagement. Even a good presentation has at least a few things it can improve on, so there will always be feedback to receive, whether positive or negative. Asking for feedback will also help improve your presentation skills. When people are asked to give feedback on a presentation, most of what you receive will concern your delivery or your slides.
Getting helpful feedback is a critical step, but it can be harder to find than you might expect. Honest feedback calls on you to be vulnerable and sometimes forces your feedback partner to deliver difficult constructive criticism. The good news is that deep and authentic feedback can encourage personal growth and a willingness to take creative risks. So, getting high-quality feedback that elevates your writing and presentation skills is essential.
Types of feedback¶
There is no one-size-fits-all for feedback. While there are common characteristics of effective feedback, the form it takes will change across contexts.
Types of feedback may include corrective, epistemic, suggestive, and epistemic + suggestive [1].
Type | Purpose | Sample language |
---|---|---|
Corrective | Corrective feedback is specific to how well the work aligns with the objectives. This feedback highlights areas where the author met expectations and areas for improvement. | You do a great job addressing [objective]. However, the assignment also asked for x, but x is not present. How might you address [objective]? |
Epistemic | Epistemic feedback prompts authors to think more deeply about their work. It asks for further clarification, challenging authors to delve deeper into particular ideas. | Could you say more about x? (You may also ask specific questions to which further clarification can or should respond.) |
Suggestive | Suggestive feedback gives authors advice on how to improve upon their work. It also underscores specific areas or ideas for expansion. | Giving an example of [this concept] would make your description more straightforward. |
Epistemic + Suggestive | The combination of epistemic and suggestive feedback prompts authors to offer further clarification and specific suggestions. This can be a helpful combination because it asks authors to “say more” and provides specific suggestions for how they might do so. | How did you reach this conclusion? Think about the point you made on page x. |
These particular types of feedback are not exclusive of each other. Commonly, the feedback you give will have elements of some, if not all, of these four types. What type you use at what point will depend on the goals, the purpose of the feedback, and the kinds of revisions and responses you are trying to solicit.
Characteristics of effective feedback¶
Effective feedback is a cornerstone of successful learning in computational biology, as it plays a pivotal role in guiding students toward improvement and growth. To ensure that feedback serves its intended purpose, adhering to several guiding principles is crucial.
Targeted and concise
One common pitfall in providing feedback is overwhelming authors with excessive information. While it's natural to want to cover every aspect of their work, an overflow of feedback can lead to confusion and make it difficult for them to know where to begin. Identifying and communicating two to four main areas for improvement is advisable. By distilling your feedback to these core points, you help them focus their efforts on the most crucial aspects. This approach prevents them from feeling overwhelmed and provides a clear roadmap for their revision process.
Alignment with objectives
Feedback should always be tailored to the main points the author wants to communicate to the audience. Rather than offering generic advice, link your feedback to the stated goals of the task. Making these connections keeps the focus on how your feedback contributes to the author's objectives. This alignment fosters a deeper understanding of the material and encourages students to engage with the content thoughtfully.
Action-oriented guidance
The essence of effective feedback lies in its actionable nature. Rather than solely identifying flaws, provide the author with clear and specific suggestions for revision. Highlight particular sections or elements within their work that could benefit from refinement and offer guidance on addressing those areas. This guidance should be tangible and practical, serving as a step-by-step roadmap to enhance the work.
Timely feedback iterations
Feedback loses effectiveness when provided too close to the final submission deadline. Be conscious of how much time the author has to incorporate your feedback. Proposing large, substantial changes at that point would not be productive because the author may not have enough time to make them. Instead, establish a feedback loop that allows authors ample time to engage with your comments and make revisions. Frequent feedback opportunities before the deadline create an iterative process where they can progressively refine their work. This ongoing engagement facilitates improvements in the specific assignment and nurtures a deeper understanding of the subject matter over time.
Incorporating feedback¶
- Stay open-minded: Approach feedback with an open mindset. Recognize that feedback is not a critique of your abilities but an opportunity for improvement. A receptive attitude sets the stage for a constructive feedback loop.
- Review and reflect: Take time to review the feedback you’ve received thoroughly. Read through the comments and suggestions carefully, making notes of the key areas highlighted for improvement. Reflect on how these areas align with your original goals for the writing or presentation.
- Prioritize feedback: Not all feedback is created equal. Identify the most crucial points mentioned in the feedback.
- Understand the context: Contextualize the feedback within the larger scope of your work. Understand how the suggestions align with the purpose of your writing or presentation. This ensures that your revisions are aligned with your intended message and goals.
- Plan revisions: Devise a clear plan for incorporating the feedback. Break down the revision process into actionable steps. This could involve rewriting specific sections, reorganizing content, or enhancing visual aids for presentations.
- Implement changes: Start making the changes based on the feedback you’ve received. Be willing to rework sentences, rearrange paragraphs, or adjust slides. Don’t be afraid to experiment with different approaches to see what works best.
- Seek clarification: If specific feedback points are unclear, seek clarification from the person who provided the feedback. Clearing up any uncertainties ensures you make accurate changes and fully grasp the intended suggestions.
- Proofread and edit: After implementing the changes, proofread and edit your work meticulously. Ensure that your revisions flow seamlessly with the rest of the content. Address any grammatical or formatting issues arising during the revision process.
- Share again for feedback: Share your revised work with the same or other individuals for further feedback. This helps you gauge whether your revisions effectively address the initial concerns and whether the changes enhance the overall quality.
- Reflect on the process: Take a moment to reflect on the entire feedback and revision process. Consider how you’ve evolved as a writer or presenter and the lessons you’ve learned. This reflective practice enhances your ability to apply feedback in the future.
- Apply lessons learned: As you progress with your writing and presentation projects, apply the lessons you’ve learned from previous feedback experiences. This ongoing improvement cycle will contribute to your continuous growth and development.
[1] Leibold, N., & Schwarz, L. M. (2015). The art of giving online feedback. Journal of Effective Teaching, 15(1), 34-46.
🚀 Feedback practice
Total duration: 15 minutes
Objective¶
To engage students in analyzing and providing feedback on writing samples from published scientific articles, promoting critical evaluation and enhancing writing and communication skills. This exercise will help students hone their analytical and feedback skills while connecting their feedback to the actual writing skills needed for effective communication in the field of computational biology.
Form groups¶
Duration: 1 minute
Have the students split up into groups of two or three. Ensure at least one student per group has access to the internet via smartphone or laptop.
Selected text¶
Duration: 1 minute
Use generative AI to intentionally produce text with grammatical and readability issues. Sometimes it takes a little convincing and asking nicely to get a response other than this:
I apologize, but I am programmed to provide accurate and coherent responses. Generating intentionally difficult-to-read text or introducing grammar errors goes against my design to assist and communicate effectively. If you have any other questions or requests, please feel free to ask!
Then, assign half the groups to each of the two examples.
Example 1¶
This study focuses on investigating the impacts of gene sequence on various disease. Genetic factors are observed to contribute significantly towards disease susceptibilities. The major objective is to understand comprehensively gene functionality and its implications for disease manifest. This research carries clinical relevance and potential for enhance therapeutic strategies.
A series of experimental procedures were undertook to procure essential data. Cellular samples were extracted and subjected with controlled chemical treatments. Then, a state-of-art analytical instrument was employed for assess quantification. These measurements were performed in triplicate for ensure robustness and accuracy.
The quantitative values obtained from experiments exhibited noticeable variations, indicating relevant trends. Graphical representations were generated to visualize these trends in a more effective manner. The initial visualizations were refined through iterative adjustment on parameters, result in clearer insight. The data highlighted distinct patterns of interconnectivity among genes, suggest coordinate regulatory mechanism.
The fiding of this study offers value insight into complex relationship between genes and disease etiology. While the results are intrigue, they also present challenge in interpretation. Notably, observed gene clusters hint potential functional module, yet the underlying mechanism require further elucidate. It's reminiscent of decipher cryptic code, demand meticulous scrutiny.
In conclusion, this research underpin the intricacy of genetic influence on diseases. Unravel the subtlety of gene interaction has prove a challenging endeavor. The groundwork lain by this study poised to pave way for more comprehensive investigation that might decode the enigma language of genes and disease.
Example 2¶
The current study endeavors to elucidate the intricate interplay between genetic sequences and the multifaceted landscape of disease pathogenesis. Leveraging sophisticated computational methodologies, we aspire to unravel the nuanced mechanisms underlying the modulation of cellular processes by genetic elements. The overarching aim is to foster a more comprehensive comprehension of the intricate molecular tapestry that governs the manifestation of disease states, thereby furnishing novel insights with potential implications for therapeutic intervention.
In a meticulously orchestrated series of methodological steps, cellular samples were procured and subjected to a precisely calibrated regimen of chemical treatments. The ensuing aliquots were subsequently subjected to high-throughput analytical assessments facilitated by cutting-edge instrumentation. The collected data was subsequently subjected to rigorous multivariate statistical analyses, aiming to unveil patterns that might be indicative of intricate cellular dynamics.
The quantitative outputs gleaned from the rigorous analytical assays evinced a plurality of variances, suggestive of discernible trends operative within the intricate biological system under scrutiny. This compelling data was effectively transmuted into graphical representations, affording a multi-dimensional visual exposition that conveys intricate interrelationships between disparate data points. Noteworthy refinements to visualization parameters were iteratively executed, culminating in elucidatory graphic renderings that afforded an amplified understanding of the complex interdependencies embedded within the data structure.
The findings of this comprehensive investigation resonate with a convergence of insightful facets, effectively converging upon an enhanced perspective on the intricate web of interactions subsisting between the genomic constituents and the clinical phenotypes. While the outcomes evoke a sense of intellectual intrigue, the interpretive undertaking is not devoid of formidable challenges. Notably, the emergent gene clusters furnish compelling glimpses into prospective functional modules, eliciting tantalizing implications poised to invigorate subsequent investigative trajectories.
In summation, the present study has engendered a nuanced framework for exploring genetic influences on disease etiology. The discernment of subliminal gene interactions has materialized as a formidable intellectual expedition emblematic of the intricate nature of molecular networks. The groundwork laid forth through this inquiry serves as a fulcrum for future inquiries, propelling the vanguard of research to unravel the latent lexicon encoded within the labyrinthine narrative of genes and disease.
Analyzing the writing¶
Duration: 8 minutes
Analyze your assigned writing based on clarity, organization, coherence, and grammar. Your feedback should follow these guidelines:
- Identify strengths: Highlight well-constructed sentences, effective transitions, and clear explanations.
- Address areas for improvement: Identify sections that might benefit from clearer explanations, improved flow, or better integration of evidence.
- Relate to the article's objective: Relate your feedback to what you believe are the article's goals.
Feedback session¶
Duration: 5 minutes
After the group discussions, bring the class together for a larger discussion. Encourage students to share common feedback themes and insights from their groups.
Team¶
Alex Maldonado (Instructor)¶
Address me as: Alex (preferred), Dr. Maldonado, Dr. Alex, Dr. M.
Pronouns: he/him/his
Major: BSE (Western Michigan University) and PhD (University of Pittsburgh) in Chemical Engineering
Level: Postdoctoral Associate in Computational Biology
Contact: alex.maldonado@pitt.edu for most communication.
I generally respond within 12 hours.
Office: 103 Clapp Hall
Ask me about . . . recent films I've watched, my favorite Pittsburgh restaurants, my cat, my Spotify On Repeat.
Research . . . Undergraduate: immunodiagnostics. Graduate: quantum chemistry, machine learning, molecular simulations. Postdoc: force field parameterization, protein-ligand binding, drug discovery.