Assessment FAQs and Terms
Ask the Assessment Team
Contact the Assessment Team at email@example.com with questions or suggestions for this website. As we receive questions, we will add them to the Frequently Asked Questions list.
Definitions of common terms
Assessment is any effort to gather, analyze, and interpret evidence that describes institutional, departmental, divisional, or agency effectiveness, and has the purpose of improving student learning and development (Upcraft, M.L., & Schuh, J.H., 1996).
An Assessment Map is a chart that demonstrates the connection between program goals/standards and the Key Assignments used to evaluate program quality. In the GSE, it is in chart form with the program goals/standards listed down the side and the Key Assignments listed across the top. In the column for each Key Assignment, the box is checked if that assignment measures student performance on the corresponding program goal/standard.
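The chart layout described above can be sketched as a small data structure. This is only an illustration: the goal names, assignment names, and the helper functions `render_map` and `unmeasured_goals` are hypothetical and are not part of any GSE tool.

```python
# Hypothetical program goals and Key Assignments, for illustration only.
goals = ["Goal 1", "Goal 2", "Goal 3"]

# Each Key Assignment maps to the set of goals it measures (the "checked boxes").
assessment_map = {
    "Case Study": {"Goal 1", "Goal 2"},
    "Capstone Project": {"Goal 2", "Goal 3"},
}


def render_map(goals, assessment_map):
    """Return the map as text: goals down the side, assignments across the top."""
    assignments = list(assessment_map)
    width = max(len(g) for g in goals)
    rows = [" " * width + " | " + " | ".join(assignments)]
    for goal in goals:
        cells = [
            ("X" if goal in assessment_map[a] else "").center(len(a))
            for a in assignments
        ]
        rows.append(goal.ljust(width) + " | " + " | ".join(cells))
    return "\n".join(rows)


def unmeasured_goals(goals, assessment_map):
    """List goals that no Key Assignment measures."""
    measured = set().union(*assessment_map.values())
    return [g for g in goals if g not in measured]


print(render_map(goals, assessment_map))
print("Goals with no Key Assignment:", unmeasured_goals(goals, assessment_map))
```

A check like `unmeasured_goals` is one way to confirm that the selected Key Assignments cover every program goal.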
Culture of assessment
Culture of assessment is an environment in which continuous improvement through assessment is expected and valued.
Direct measures of student learning require students to display their knowledge and skills as they respond to the instrument itself. Objective tests, essays, presentations, and classroom assignments all meet this criterion (Palomba, C.A., & Banta, T.W., 1999).
In indirect assessment, learning is inferred (e.g., from usage data or satisfaction surveys) rather than supported by direct evidence. Students reflect on learning rather than demonstrate it (Palomba, C.A., & Banta, T.W., 1999).
Inter-rater reliability is the degree to which scorers/raters agree on the same score for the same sample of work. If the inter-rater reliability is high, there is a high degree of agreement between raters/scorers in terms of what “a score of 3 means,” or what “proficient looks like.” If different raters/scorers evaluate an assignment differently using the same rubric, the rubric has low inter-rater reliability. Points to consider about low inter-rater reliability include:
- Low inter-rater reliability is unfair to students because their score depends on the scorer, not on their true performance.
- Low inter-rater reliability also means that the scores cannot reliably be analyzed for program review.
- Low inter-rater reliability can be alleviated with more descriptive scoring tools/rubrics and increased opportunities for rater/scorer preparation.
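One way to put a number on agreement between two raters is sketched below. This page does not prescribe a statistic; percent agreement and Cohen's kappa are two commonly used measures, chosen here as an assumption. The scores are made-up examples on a 4-point rubric.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of work samples on which both raters gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of samples.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for agreement expected by chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability both raters pick the same score independently.
    expected = sum(counts_a[s] * counts_b[s] for s in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical score throughout
    return (observed - expected) / (1 - expected)


# Hypothetical scores from two raters on the same eight pieces of student work.
rater_a = [3, 2, 4, 3, 1, 2, 3, 4]
rater_b = [3, 2, 3, 3, 1, 2, 4, 4]

print(percent_agreement(rater_a, rater_b))       # 0.75
print(round(cohens_kappa(rater_a, rater_b), 3))  # 0.652
```

Percent agreement is easier to explain, while kappa discounts the agreement two raters would reach by chance alone, so the two values can differ noticeably on short scales.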
A Key Assignment provides useful data about the quality of the program being offered. Each program should select a set of Key Assignments that can be used to evaluate student performance on every program goal. Student performance data gathered from this set of Key Assignments can then be analyzed to determine the program’s ability to meet its program goals.
Learning outcomes are goals/standards that describe how a student will be different because of a learning experience. More specifically, learning outcomes are the knowledge, skills, attitudes, and habits of mind that students take with them from a learning experience.
A Likert scale is an item type used on surveys and other instruments that allows respondents to indicate their level of agreement with a statement by marking their response on a scale (e.g., a 4-point scale of agree/disagree).
Method of assessment
Method of assessment highlights the way(s) in which the learning outcome or program goal will be measured. Multiple methods will yield richer data.
Program Assessment Plan
A program assessment plan is a systematic plan for collecting, analyzing, and reviewing data as a part of continuous program improvement. In the GSE, a complete Program Assessment Plan includes: a list of key assignments with reliable rubrics that are mapped to program goals/standards, evidence of the use of data for program review reported quarterly to the Dean, and an annual reflection shared with the GSE.
Program goals should identify what your graduates/completers will have learned by the end of your program. This may include references to, or a list of, recognized professional standards to which your program must adhere. It should also include the GSE Conceptual Framework and/or other core elements that are considered essential to the program.
Reliability is the consistency of measurement, or the degree to which an instrument measures the same way each time it is used under the same conditions with the same subjects. In short, it is the repeatability of a measurement. A measure is considered reliable if a student’s score on the same test given twice is similar. A rubric is considered reliable if two people using the rubric to evaluate the same piece of student work come up with the same score.
A rubric is a scoring tool used to assess student learning. At its most basic, a rubric divides an assignment into its component parts and objectives (tied to program goals/standards) and provides a detailed description of what constitutes acceptable and unacceptable levels of performance for each part. Rubrics can be used to grade any assignment or task (Stevens, D. www.introductiontorubrics.com/overview.html).
- When a rubric is agreed-upon and communicated prior to the student's work being completed, the grading process is very clear and transparent to all involved.
- Student scores from rubrics can easily be aggregated to provide information about how well program goals are being met.
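As a rough sketch of how rubric scores might be aggregated by program goal, the example below groups scores and reports a mean and the share of scores at or above a cutoff. The student names, criteria, goal labels, 4-point scale, and the "proficient" cutoff of 3 are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rubric scores: (student, rubric criterion, program goal, score on a 1-4 scale).
scores = [
    ("Student A", "Thesis", "Goal 1", 3),
    ("Student A", "Evidence", "Goal 2", 2),
    ("Student B", "Thesis", "Goal 1", 4),
    ("Student B", "Evidence", "Goal 2", 3),
]


def summarize_by_goal(scores, proficient=3):
    """Group rubric scores by program goal and summarize each group."""
    by_goal = defaultdict(list)
    for _student, _criterion, goal, score in scores:
        by_goal[goal].append(score)
    return {
        goal: {
            "mean": mean(values),
            # Share of scores at or above the assumed "proficient" cutoff.
            "share_proficient": sum(v >= proficient for v in values) / len(values),
        }
        for goal, values in by_goal.items()
    }


for goal, summary in summarize_by_goal(scores).items():
    print(goal, summary)
```

Because each rubric criterion is tied to a program goal, the same scores used for grading can feed program review without extra data collection.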
Student learning outcome
Student learning outcomes are what students will demonstrate they know or are able to do as a result of the program.
Validity is the degree of agreement between what the assignment is supposed to measure and what it actually measures. An example of low validity would be an assignment that asks students if they know what a computer is and then reports that these students have technical skills if all students say yes. In this case, the assignment measured students’ knowledge of what a computer is rather than their technical skills in using a computer.