Research Methods -Psychology - class notes


Class, Oct. 22, 2001

Chapter 1 of book
  1. Basic research –general issue
  2. Applied research –deals w/ specific questions
-i.e. Skinner = general conditioning = basic.

-->Later: specific effects of conditioning = applied

Empirical test: open to retesting, can be proven wrong

-->objective


Scientific method: tries to avoid distortions made by perception/judgment

-->reduces but doesn’t avoid mistakes of perception/judgment

-need to be able to answer the following:

  1. is it an exact knowledge/science?
  2. Can it be generalized, or is it specific to a specific population
  3. Can it be reproduced independently?

-is there a specific/unique methodology to a social science? (i.e. observation/questionnaire)

-->not really! You can use many diff. kinds of methods

views:

  1. general approach: in daily life, we rely on logic/intuition/what’s accessible to us

-can’t be based on a single case: could be the exception


Tools: need to be reliable and applicable

terms

Reliability: it always gives the same results -->the watch could be fast, but it is always fast by the same speed.

-->the ruler isn’t reliable since the temperature always changes its length a bit.

Hypothesis: a temporary assumption of explanation of a thing/relation b/w thing.


-in daily life, we tend to be ‘large’: general --> we like to generalize

-->yet in science, we’re skeptical

                     Scientific                    Daily
General approach     Empiric                       Intuitional
Observation          Critical/methodological       Random/non-critical
Report               Objective                     Subjective/colored
Terms                Well-defined/univocal terms   Vague/multi-meaning
Tools/measurements   Reliable/applicable           Non-reliable/inapplicable
Hypothesis           Testable                      Not necessarily testable
Attitude             Skeptical                     Accepting

Charles Peirce: (philosopher)

4 sources of knowledge:

  1. level of tenacity (intuition)
  2. method of authority
  3. logic/honest mind
  4. scientific study


-scientific truth: probability… doesn’t have to happen: it is just probable it would happen.

-->Tylenol doesn’t always help

4 goals of science

  1. Describing
  2. To predict
  3. To understand
  4. Control


Term            Details
Description     Needs to be a full description! Not a vague, short description. No half-truths (it is not better [but not worse])
Predict         I can predict w/o understanding the intrinsic details
Understanding   To give reasons
                -so that we can perhaps have alternative, parallel cures
Control         We control how the participant behaves in the experiment


Tools of the scientific method





Measures:

-in psychology, the measures aren’t as clear as other sciences

->measures don’t exactly measure what we mean

-psychologists try to measure how accurate their terms are

-it doesn’t really matter how people use the term as long as they define it properly – so that at least everyone knows what they mean when they use the terms

Experimentation

-we try to experiment

-in order to run an experiment and prove something, we need to have a situation





--

-science: tries to look for general rules

Experimentation

Labs: help isolate individual causes/effects (causality)

-->more possible factors: less certain what the real answer is

things that give experimentation their strength

  1. we control which order
  2. we control the size of items
  3. we can repeat it
  4. randomness

Good thinking

-hard to define

-has to have a logic

-->won’t be infl. by things like our biases/thoughts/etc…

Law of Parsimony

-Shortest/most inclusive rule

Open mindedness

-you take into consideration unexpected things.

-gov’t involvement – usually hinders the science

Replication

-that is why you need to write the process

-in psych – hard to replicate some things: like studying dreams – can’t necessarily repeat dreams

Ethics

Basic rule: the person won’t be hurt physically/emotionally in any way

-->especially when the damage is irreversible



Example:


APA’s standards: you’ve got to weigh the benefits vs. the ethical expense

-->ethical boards


-ethics also apply to experiments on animals

Class, Dec. 24, 2001

theories/hypothesis

-the more cases/issues/variance that are explained by the theory – more inclusive – it is a better theory

Theory: a system of principles/assumptions that explains/predicts b/h.

-explains relationships b/w things

-->an organizing scheme

-->a point of origin for future research

-info are just building blocks. The theory arranges them into a structure

-internal vs. external locus of control:

-Rotter: you can divide people into 1 of 2 groups:



-when 1 is promoted to a managerial job from the group – there is a feeling of inequity, yet if a manager is brought from the outside -->higher motivation to listen to him

Theories develop and replace others

Norman Triplett

If a person does something which another person does as well, I do it better/faster:

-->coaction: almost instinctual tendency to become aroused and do it faster

-->this instinctual energy is called dynamogenesis

Psychologists don’t like to speak of instincts:

  1. there are things which we used to think are instinctual, which turned out to be a learned thing
  2. anthropologists found things which are seemingly ‘instinctual’ yet diff b/w tribes
  3. humans are beyond animals/instincts
  4. I don’t explain it when I define it as an instinct -->it just labels the act

Floyd Allport

(brother of the one who called Milgram’s experiment the ‘Eichmann experiment’, b/c they blamed everyone else)

-->changed name ‘coaction’ to social facilitation

-change of name = usually shows a change in direction

F. Allport: gets people to do simple vs. hard tasks (i.e. math/puzzles/etc.)

-->social facilitation helps when the task is simple/distracts when the task is hard

-->vs. Triplett, who thought that coaction only helps

Zajonc:

-the mere presence of others, not only if they’re doing the same act

-->also happens to animals as simple as ants!

Zajonc (psychologist) notes:

-highly practiced response/instinctive = facilitated in presence of coaction/audience

-newly learnt responses/actions = impaired by presence of coaction

-everything has a hierarchy of responses

2 stepped-study:

-people had to read sounds w/o meanings, each w/ various rates of repetition (of it returning)

-->the more it returned, the more dominant the response

-then they were told that they were shown flashes of those nonsensical words (yet in reality, they were just shown flashes of scribbles)

-->they answered that they were shown flashes of the repeating words more

-->but the curve was steeper w/ social facilitation

-Zajonc: only people who could appreciate our actions (as opposed to people who can’t, like sleeping people)

Study:

Scenarios:


Results: order of social facilitation, from highest to lowest:


-Conclusion: the mere knowledge of being watched, w/o being watched almost has the same effect

Study

Duval/Wicklund

Objective self-awareness: it is an issue of self-awareness – not others’ evaluations

-->so they try to bridge the gap b/w reality and ideal, or escape the gap

[subjective self-awareness: how I look at someone else.]

Study

-waiting for results for a test:


-->evaluation of oneself

-->development of theory of how we change our actions

-->after all, Triplett’s theory can explain cases like video or mirrors

Class, Jan 7th, 2002

Operational definition

-each paper has to define its terms, since each person has diff. usage of the terms

-->like a recipe: has to tell exactly what the researcher meant

i.e. ‘anxiety’ is unclear unless you clearly state the meaning of the word

Validity

Internal vs. external validity

Internal validity

Internal validity: to what degree can we claim that the change is b/c of the indp. variable and not b/c of other reasons

-->in other words, the researcher had control over the results

-as opposed to having rival/alternate explanations

3 criteria for internal validity

  1. Control group
  2. Extraneous variables: variables that affected our results w/o our intention
  3. Confounding variables: variables that disturb the result: can’t know whether it or the indp. variable brought the result.

-I need at least 2 levels in the indp. Variable to make an experiment

-->one is given to the control group/condition

-->the other is given to the experiment group/condition

External validity

Can it be generalized to other cases?

Jan 14, 2002

The ideal is to test ALL the populations.



Table of random numbers: a chart made by a computer where all the # are random

-the external validity is hurt if you didn’t choose random
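A computer can stand in for the printed table of random numbers. The sketch below is only an illustration, assuming Python's standard `random` module; the function name `randomly_assign` and the participant numbers 1-20 are made up for the example and do not come from the class.

```python
import random

def randomly_assign(participants, n_groups=2, seed=None):
    """Shuffle the participants and deal them into n_groups groups."""
    rng = random.Random(seed)      # the seed only makes the example repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    # Deal like cards: the i-th shuffled participant goes to group i mod n_groups.
    return [shuffled[i::n_groups] for i in range(n_groups)]

# Hypothetical participants numbered 1..20, split into 2 conditions.
control, experimental = randomly_assign(range(1, 21), seed=1)
print("control:     ", sorted(control))
print("experimental:", sorted(experimental))
```

Because chance alone decides who lands in which group, the experimenter cannot (even unintentionally) put a certain kind of participant in one condition.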

Important:

Example - Brady

Question

-Why do managers dev. more ulcers than others

Answer

-studies on monkeys: 1 is a leader monkey -->has to press a lever in order to stop the shock to all monkeys

-->manager monkeys had more ulcers than regular ones

-but then people started realizing that Brady didn’t have randomness: faster learners were managers/slower learners were ‘other’ monkeys

-->NO RANDOMNESS

Size of sample

-the bigger the #, the more reliable the results are.

-the weaker the strength of the diff. (how diff. the 2 groups are), the harder it is to tell the diff. -->need a bigger #

-never less than 10 participants
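The link between the strength of the difference and the number of participants can be sketched with the usual normal-approximation formula for comparing 2 groups. Everything specific here is an assumption for illustration and was not given in class: the standardized difference `d`, the significance level of .05 (two-sided), the power of .80, and the function name `n_per_group`.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect a standardized
    mean difference d between 2 groups (normal approximation)."""
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # two-sided significance cutoff
    z_power = z.inv_cdf(power)            # cutoff for the desired power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)

# Strong, medium and weak differences (hypothetical values of d).
for d in (0.8, 0.5, 0.2):
    print(f"d = {d}: about {n_per_group(d)} participants per group")
```

The weaker the difference, the faster the required number grows, which is the point of the note above.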

The experimental design

=the structure of the experiment

-there are many diff. kinds of structures


The differences lie in:


Example

Good vs. bad news

Between subjects design

-diff participants in each of the categories

Within subject design

-the participants participate in some/all of the experimental situations

-->also called repeated measures design

Mixed design

-one variable is between subject design and another variable is within subject design

between subject design:

-simplest experimentation: 2-group-design



Matched groups

-only do matching (hat’ama) when the literature says that there is a relevant variable involved

-making pairs of things that are similar -->then putting them in opposite groups (control/experimentation), based on say: a toss of a coin

-->can do this b/f or after the setting up of the experimentation
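The pairing-then-coin-toss procedure can be sketched as follows. This is a minimal illustration only: the function name `matched_assignment`, the participant labels and the matching scores in `iq` are hypothetical, and the coin toss is simulated with Python's `random` module.

```python
import random

def matched_assignment(scores, seed=None):
    """scores: dict of participant -> score on the matching variable.
    Pairs up neighbours in rank order, then tosses a coin for each pair."""
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get)   # most similar people end up adjacent
    control, experimental = [], []
    # Walk through the ranking two at a time; with an odd number of
    # participants the last one is left unassigned.
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)                     # the coin toss
        control.append(pair[0])
        experimental.append(pair[1])
    return control, experimental

# Hypothetical matching variable (made-up scores).
iq = {"p1": 98, "p2": 121, "p3": 100, "p4": 119,
      "p5": 87, "p6": 110, "p7": 89, "p8": 108}
control, experimental = matched_assignment(iq, seed=0)
print("control:     ", control)
print("experimental:", experimental)
```

Each pair contributes one member to each group, so the groups are balanced on the matching variable while chance still decides who gets the manipulation.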


1) Multiple group design

-some groups

F-test: to see if the diff. b/w the group averages is bigger than the individual diff. w/i the groups

-->look up chart in stats class of last year

-if calculated F is:

-deal w/ charts acc. to h.m. people are in the experiment

2)Factorial design

Source SS DF MS F
A
B
AxB
Within -->the df of all the cells
Total

example

Source   SS    DF   MS (variance = SS/DF)   F (MS/MS within)
A        32    1    32                      6.86
B        8     1    8                       1.71
AxB      72    1    72                      15.45
Within   14    3    4.67
Total    126   7

Report: F(1,3) = 15.45, p<0.05
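The arithmetic behind the MS and F columns can be checked with a few lines of code. This is a sketch using the SS and DF values from the example table; the function name `anova_table` is made up, and small differences from hand-calculated F values come from rounding MS within before dividing.

```python
def anova_table(ss, df, error="Within"):
    """MS = SS / DF for every source; F = MS of the source / MS of the error term."""
    ms = {source: ss[source] / df[source] for source in ss}
    f = {source: ms[source] / ms[error] for source in ss if source != error}
    return ms, f

# SS and DF taken from the example table.
ss = {"A": 32, "B": 8, "AxB": 72, "Within": 14}
df = {"A": 1, "B": 1, "AxB": 1, "Within": 3}

ms, f = anova_table(ss, df)
for source in ss:
    f_text = f"{f[source]:.2f}" if source in f else ""
    print(f"{source:<7} SS={ss[source]:<4} DF={df[source]:<2} MS={ms[source]:<6.2f} F={f_text}")
```

Each F is then compared with the minimum F from the chart for its own DF and the DF of Within.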

--

factor = same thing as independent variables in factor analysis

Example 2x4 experiment

B1 B2 B3 B4
A1 12 12 12 12
A2 12 12 12 12

-->since there are 2 levels of A: 1 df

-->4 levels of B = 3 df

Source   SS   DF   MS (variance = SS/DF)   F (MS/MS within)   Minimum F
A             1                                               3.96
B             3                                               2.72
AxB           3                                               2.72
Within        88
Total         95


Questions to do in an experiment

  1. h.m. indp. variables are there
  2. how many factors
  3. h.m. participated in the experiment
  4. fill in the df
  5. find the minimum F that the experiment needs in order for all the cells to be significant
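Filling in the df column follows fixed rules, sketched below for a 2-factor between-subjects design with the same number of participants in every cell. The function name `factorial_df` is made up for the illustration; the call reproduces the 2x4 example with 12 participants per cell.

```python
def factorial_df(a_levels, b_levels, n_per_cell):
    """Degrees of freedom for an A x B between-subjects design
    with n_per_cell participants in every cell."""
    cells = a_levels * b_levels
    n_total = cells * n_per_cell
    return {
        "A": a_levels - 1,
        "B": b_levels - 1,
        "AxB": (a_levels - 1) * (b_levels - 1),
        "Within": n_total - cells,   # participants minus number of cells
        "Total": n_total - 1,
    }

# The 2x4 example: 8 cells of 12 participants each.
for source, value in factorial_df(2, 4, 12).items():
    print(f"{source:<7} {value}")
```

The df of A, B, AxB and Within always add up to the Total df, which is a quick check on a filled-in table.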

Reporting formula:

F(df of effect, df within) = F value, p<sig. level


example: a researcher reports that in a 2-factorial experiment, there is a sig. diff.

F(2,54) = 11.8, p<.05

Source   SS   DF   MS (variance = SS/DF)   F (MS/MS within)   Minimum F
A             2
B             1
AxB           2
Within        54
Total         59 (-->60 participants)

answers

3x2

6 cells

2 factors (3 levels X 2 levels)

-->10 in each group

Source   SS   DF         MS (variance = SS/DF)   F (MS/MS within)   Minimum F
A             2 or 1
B             2 or 4
AxB           4
Within        90
Total         99 or 98

100 or 99 participated

2X5

or

3 X 3

# of groups:

9 or 10

factors 2 4 or 5

-->important: there is no single correct answer since there are alternative answers
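Working backwards from the df to the possible designs can be sketched in code. The function name `possible_designs` is made up; it only uses the rules df(AxB) = (a-1)(b-1) and df(Within) = participants minus cells, and it treats aXb and bXa as the same design.

```python
def possible_designs(df_interaction, df_within):
    """List every a X b design whose interaction df and within df match."""
    designs = []
    for a in range(2, df_interaction + 2):
        if df_interaction % (a - 1) != 0:
            continue                      # (a-1) must divide the interaction df
        b = df_interaction // (a - 1) + 1
        if a > b:
            continue                      # skip mirror images such as 5X2
        cells = a * b
        designs.append({
            "design": f"{a}X{b}",
            "cells": cells,
            "participants": df_within + cells,
            "total_df": df_within + cells - 1,
        })
    return designs

# The example above: df(AxB) = 4 and df(Within) = 90.
for design in possible_designs(4, 90):
    print(design)
```

For these df the search returns both the 2X5 and the 3X3 design, which is why there are alternative answers.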

--

from here on, unless noted: 2X2

example

          B1     B2     Average
A1        8.0    8.5    8.25
A2        1.5    1.0    1.25
Average   4.75   4.75

-we decide that any diff. b/w averages (x̄) bigger than 1.5 is treated as significant

Main effect


Interaction

Subtract A1B2 from A1B1 and A2B2 from A2B1 and see if the diff. b/w the 2 #’s is more than 1.5: if yes -->interaction. 8-8.5 = -0.5 and 1.5-1 = 0.5: diff. = 1

-->no interaction
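The interaction check on the 2X2 table of averages can be written out as a short calculation. This is a sketch of the class rule of thumb only (a cutoff of 1.5 on the difference, not a real F-test); the names `cell_means`, `THRESHOLD` and `interaction_contrast` are made up for the illustration.

```python
# Cell averages from the 2X2 example.
cell_means = {
    ("A1", "B1"): 8.0, ("A1", "B2"): 8.5,
    ("A2", "B1"): 1.5, ("A2", "B2"): 1.0,
}
THRESHOLD = 1.5   # the cutoff chosen in class for this example

def interaction_contrast(m):
    """How much the B1-B2 difference changes between row A1 and row A2."""
    diff_at_a1 = m[("A1", "B1")] - m[("A1", "B2")]   # 8.0 - 8.5
    diff_at_a2 = m[("A2", "B1")] - m[("A2", "B2")]   # 1.5 - 1.0
    return abs(diff_at_a1 - diff_at_a2)

contrast = interaction_contrast(cell_means)
print("difference of differences:", contrast)
print("interaction by the class rule:", contrast > THRESHOLD)
```

If the B1-B2 difference were the same in both rows the contrast would be 0, which is the case of parallel lines on a graph.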

-the question is: is AxB sig.? Is x̄A sig. diff. from x̄B?

-->no: in each case (A1-A2 / B1-B2), the diff. b/w them is smaller than 1.5

**-w/ graphs thynggy

To find on a line graph:


Important: if 1 is significant, then it is all significant

In other words:

A = the diff. b/w A and B -->sig. if their diff. is sig.

B = the diff. b/w averages of all of (interaction)

-->if the diff. of averages stays constant = no interaction

** all of the above

-to find the source of interaction: see if A rows is sig or B columns is sig

--

** Graphs/DF

March 11 2002

          B1   B2   B3   Average
A1        2    2    4    2.66
A2        6    6    8    6.66
Average   4    4    6

2.66-6.66: diff. >1.5 -->significant

-->we know how to compare 2, but in cases of more than 2, we pair 2 at a time of the many variables (B1B2, B1B3, B2B3). As long as 2 of the many variables are significantly diff., then we say it is sig.
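Pairing the columns 2 at a time can be sketched with `itertools.combinations`. Again this only applies the class rule of thumb (difference bigger than 1.5), not a proper statistical test; the names `column_means`, `THRESHOLD` and `significant_pairs` are made up for the illustration.

```python
from itertools import combinations

# Column averages from the 2X3 example.
column_means = {"B1": 4, "B2": 4, "B3": 6}
THRESHOLD = 1.5   # the cutoff used in class

def significant_pairs(means, threshold):
    """Compare every pair of levels and keep those that differ by more than the cutoff."""
    return [(x, y) for x, y in combinations(means, 2)
            if abs(means[x] - means[y]) > threshold]

pairs = significant_pairs(column_means, THRESHOLD)
print("pairs above the cutoff:", pairs)
print("B counts as sig.:", len(pairs) > 0)
```

With 3 levels there are 3 pairs to check; with more levels the number of pairs grows quickly, which is why the comparison is done systematically.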

--

within subjects design

-until now, we spoke of b/w subject design – how diff people react in diff situation

-->now, we’re speaking about the same person in 2 diff. cases.

Benefits of w/I subject design:

  1. you reduce diff. b/w groups b/c of individual diff., since it is the same people in both groups
  2. you can adapt the test to individuals (i.e. political people he likes)
  3. fewer people needed in the experiment -->resources
  4. time saved teaching participants what to do in the experiment
  5. sensitivity to w/i group error is lower since it is the same people
  6. we watch the same person over time and not diff. people at diff. times

Downsides of w/I subject design

  1. a person has to be in a lab for a long time -->person might get tired and react diff. to the experiment
  2. sometimes, longitudinal studies are too long to be done (40 yrs. on the same person)
  3. sometimes you have alternative explanations: i.e. rats whose brain areas have been removed and are violent: perhaps it is b/c of the medicine that puts them to sleep -->so you put a group w/o brain removal to sleep and another group that also had the brain area removed
  4. carry-over effect: when your reaction to something is infl. by something you did b/f. Therefore, you could be infl. by the previous part of the experiment!!!



Small N design

-kind of w/I subject design.

ABBA design

Where you reverse the order of the manipulation to see the effects.

-->can also be done on a group as a unit

Example: behavior modification: teaching a person new ways of b/h as an alternative to maladaptive b/h. In those cases, you don’t want to return to the maladaptive situation -->it is not ethical

questions

-table of random numbers -->what is the story here?

March 18, 2002

-external control, i.e. dealing w/ things like social factors.


Physical factors:


Resolution:

  1. Try to negate the distraction
  2. Balancing: run the experiment in 2 different environments [splitting between the 2 situations randomly] and see if there is a diff.

Within-subject experiment: has a unique problem: order effect

-Progressive error: same thing as order effect- just with more than 2 stimuli

Explanation of formula

A B L C L-1 D L-2 E

Take A, B, Last, C, Last –1, D, Last – 2, E etc….

-then fill out downwards

example of formula

-in case of A, B, C, D, E, F

First:

A B F (Last) C E (Last-1) D

ABFCED

BCADFE

CDBEAF

DECFBA

EFDACB

FAEBDC

-in a Latin square, you really need twice as many orders as letters when the number of stimuli is odd

example (of 5)

ABECD

BCADE

CDBEA

DECAB

EADBC

-now reverse the orders:

DCEBA

EDACB

AEBDC

BACED

CBDAE

-->those are the 10 orders really necessary
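The formula and the "fill out downwards" step can be sketched as a small program. The function names `first_row` and `latin_square` are made up for the illustration, and letters are limited to A-Z; the first row follows the pattern A, B, Last, C, Last-1, D, ..., each further row shifts every letter forward by one, and for an odd number of stimuli the reversed orders are added.

```python
from string import ascii_uppercase

def first_row(n):
    """Positions in the order 1st, 2nd, last, 3rd, last-1, 4th, last-2, ..."""
    row, low, high = [0], 1, n - 1
    take_low = True
    while low <= high:
        if take_low:
            row.append(low)
            low += 1
        else:
            row.append(high)
            high -= 1
        take_low = not take_low
    return row

def latin_square(n):
    """Orders of n stimuli; an odd n also gets the reversed orders."""
    base = first_row(n)
    # "Fill out downwards": every row moves each letter one step forward.
    rows = [[(x + shift) % n for x in base] for shift in range(n)]
    if n % 2 == 1:
        rows += [row[::-1] for row in rows]
    return ["".join(ascii_uppercase[x] for x in row) for row in rows]

print(latin_square(6))   # 6 orders
print(latin_square(5))   # 10 orders: 5 plus their reversals
```

Each participant (or group of participants) is then given one of the orders, so that every stimulus appears in every position equally often.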

control over miscellaneous variables:

i.e. personality/causal

-infl. internal and sometimes external validity

Personality variables

-there is a response style: answer consistent answers (i.e. ‘yes’) regardless of the content of the question

MMPI = tests personality things like OCD/depression/mania/etc.

Response set: tendency to answer in a way which one ought to answer and not truthfully, in order to give a certain impression

Crowne/Marlowe: social desirability scale:

Social desirability scale: checks if the person lies on the tests

Example of questions:

-do I sometimes get angry

-I know my voted party’s statement of mission

-->high score = high response set

solution

-you can phrase question in a way which the people don’t know how to answer.

-->i.e. ‘pets are fun to raise but some people don’t enjoy it’

situational/social factors

Demand characteristics: just like in the response set: people wonder what the experiment is really about and therefore try to realize the experimenter’s hypothesis (they don’t want to screw it since they want to look normal and not crazy)

-->no external validity, since they don’t act in a spontaneous way

-->studied by Orne

way to deal w/ demand characteristics:

Single blind experiment

-experiments, especially in medicine, where the participant is given all the info, except that he doesn’t know which condition he is in.

cover story:

-we distract the person from real issues of the experiment

accident

-make them think that what is happening is an accident, but it is really the experiment’s issue

example

-Darley/Latané:


-see how long it takes them to leave (experiment helpers were supposed to stay)/report

Accomplice/confederate

-accomplice doing the important thing, while the experimenter just doing the measuring. The participant just doesn’t know that the experiment is really done mostly by the person who is really an accomplice.

Example:

-while waiting for experiment to start, the accomplice nods at a certain word. Then, when the participant enters the experiment, he is interviewed, where the measurement is really how many times he said that word after being conditioned earlier (by the accomplice)

Detachment between the experiment and the measurement

-i.e. to check cognitive dissonance, see if the attitude change is there the day after, when the participant doesn’t know that he’s really being asked about the thing he took part in earlier (i.e. the day b/f)

Experimental bias: the resulting bias made by the experimenter: even the instructions have an effect on the participant, even though what he said never happens

chapter 4 of book

4 categories of variables

  • situational variables: situation/environment, i.e. density of class
  • Response variables: the b/h responses of individuals, i.e. response time
  • individual difference variables: the characteristics of individual, i.e. gender
  • Mediating variables: the psychological processes that mediate the situational variable on the particular response

Operational definition: how the researcher defines the variable, in order to use it/manipulate it.

Positive linear relationship: positive correlation

Negative linear relationship: negative correlation

Curvilinear relationship: the direction of the relationship changes: like a U or an upside-down U.

-->also called non-monotonic function

Non-experimental methods: observations/measures of variables

-->can’t assume causality -->there might be a cause not observable!

-->third variable problem: there might be a link b/w the variables, but there might be a third variable causing both variables

-->alternative explanations of the link b/w the dependent/independent variables

-->fewer third variables: causal relationship b/w variables is stronger

  • Field experiment: independent variable is manipulated in a natural setting.

-->disadvantage: less control of all factors

  • ex post facto: assignment to groups based on a thing that happened to them (i.e. divorce)

    -->this is only in a non-experimental method

  • Participant variables: (also called subject variables) variables relating to subjects

    -->in a non-experimental case

Experimental method: manipulation of variables in a controlled setting: 1 variable is manipulated and then the other is measured

  • Experimental method: all extraneous variables are kept constant, therefore ensuring that only the manipulated variable is responsible for the resulting change
  • Randomization: when there is a variable that is not controllable (i.e. income) then the ranges are randomly placed in the various experiment groups

Independent variable: causing the dependent variable’s change(?)

-->always on horizontal axis

dependent variable: always on vertical axis.

-in some cases, where a study deals w/ future predictions of b/h, no cause-and-effect needs to be studied

correlation coefficient: how strongly 2 variables are linked

Chapter 5

Validity

Validity: the truth of the experiment

  • Construct validity: adequacy of the operational definition of the variable. Does it actually define the term, or does it really define something else?

  • Internal validity: can we draw conclusions from our data? Are we sure that in reality, 1 variable really infl. the other, as we said? If yes = high internal validity.

    -->Cause-and-effect correctly used

  • External validity: can results be generalized to all cases? Will the results be the same w/ other participants/other situations?

Reliability: consistency/stability of a measure. I.e. if the measurement is similar every time it is measured.

2 components of measure:

  1. true measure: the real score on the variable
  2. measurement error: unreliable measure of a variable (i.e. unreliable measure of intelligence) has high measurement error

-lower reliability: higher variability -->could still be a normal distribution with high variability.

Pearson product-moment correlation coefficient: used to calculate the correlation coefficient (how strongly 2 variables are linked). Measured from -1.00 to +1.00. 0 correlation = no relation. +1.00 = perfect positive relationship and -1.00 = perfect negative relationship.

  • plus sign = when 1 variable is high, then the other variable is also high
  • minus sign = when 1 variable is low, then the other one is high
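The coefficient can be computed by hand from its definition: the summed products of the deviations from the two averages, divided by the square root of the product of the two sums of squared deviations. The sketch below assumes nothing beyond that definition; the function name `pearson_r` and the small data sets are made up for the illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)   # undefined if either list has no variation

hours = [1, 2, 3, 4, 5]                       # hypothetical scores
print(pearson_r(hours, [2, 4, 6, 8, 10]))     # rises together: plus sign
print(pearson_r(hours, [10, 8, 6, 4, 2]))     # one rises, other falls: minus sign
print(pearson_r(hours, [3, 1, 4, 1, 3]))      # no linear relation
```

The same function is what test-retest and split-half reliability rely on: the two sets of scores are simply correlated with each other.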

test-retest reliability: the same test given to the same person at 2 diff. times

-should be used for measures expected to be consistent over time (i.e. intelligence)

-for most measures, in order to be reliable, got to be over .80

in cases where things change, i.e. mood, one must use other methods to find reliability, using a single test

internal consistency reliability: many questions really checking 1 thing – the reliability is acquired through the sum of all the answers

split-half reliability: comparing 1 half of the test to the other half of the items. The items are split randomly.

Cronbach’s Alpha: each item is correlated w/ each other item. An average of all those correlations is taken.
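The averaging of inter-item correlations can be sketched as below. Note the assumption: this is the "standardized" form of alpha, k·r̄ / (1 + (k-1)·r̄), built from the average inter-item correlation r̄ as described above; the more commonly reported raw alpha is computed from item variances instead. The function names and the answers of the 5 hypothetical respondents are made up.

```python
from itertools import combinations
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def standardized_alpha(items):
    """items: one list of scores per question, same respondents in the same order."""
    k = len(items)
    correlations = [pearson_r(a, b) for a, b in combinations(items, 2)]
    mean_r = sum(correlations) / len(correlations)   # average inter-item correlation
    return k * mean_r / (1 + (k - 1) * mean_r)

# 3 hypothetical questions answered by 5 respondents.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = standardized_alpha(items)
print(round(alpha, 2))
```

The more the questions agree with each other, and the more questions there are, the closer alpha gets to 1.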

Item-total: the correlation of each item w/ the total score. Allows you to eliminate an item which doesn’t quite correlate w/ the other items

Interrater reliability: in cases of observations where you have a rater, 1 rater might be unreliable, so you have several raters and see the reliability of the raters to each other.

Construct validity of measures:

Construct validity: how ‘true’ the operational definition is.

-->usually built up over many studies.

-->with time, measures are changed, to fix problems w/i them

  • face validity: does the measure at face value appear to measure what it is supposed to.

    -->person estimates whether the given question really measures the variable

    -->Problems: 1) not an empirical way 2) many things measured don’t have an obvious face validity

Convergent / discriminant validity

Convergent validity: when the variable is shown to be related to other variables it should be related to. I.e. self-esteem and well-being

Discriminant validity: variables known not to be related to each other.

Criterion validity

Criterion validity: measure of construct validity of measures which predict future b/h (i.e. SAT) –see if test (predictor variable) is related to future b/h. if criterion validity is high, then the test predicts well future b/h

Predictor variable: the test/measure which sets to predict future b/h

Criterion variable: the future b/h being predicted.

Reactivity of measures

Reactivity: if results might change if participant knows he’s being measured.

Unobtrusive/non-reactive: measuring variables indirectly, i.e. in seeing near which paintings in museum the tiles are most worn – you’d know that they’re most popular painting.

Variables

  • Nominal: a kind of variable which merely categorizes: i.e. gender. No meaning to order or # value
  • Ordinal: a kind of variable indicating order: i.e. ranks in the army
  • Interval: a kind of variable where there is a significance to the interval – the diff. b/w the numbers. I.e. the diff b/w 5 and 6, and 8 and 9 is the same 1 of a diff., though there is no meaning to an absolute 0. i.e. there is no absolute 0 in Celsius: 0 is not the absence of temperature.
  • Ratio: intervals with significance of the interval and there is an absolute 0! Examples include time and weight.

April 22, 2002

Experimenter bias: diff. terms for the same thing: the way the experimenter b/h in a way that affects the results

Also called:


-->even on animals!

Examples

Double blind experiment: both the participant and the experimenter don’t know which of the experimentation settings the participant is in

Quasi-experimental design

-researches similar to experiments

-natural setting experiments are used to:

  1. check the external validity of the lab experiment
  2. improve the conditions/to get better results
  3. things that you can’t do in a lab


Differences b/w lab and natural settings:


hardships in field research



Threats: things that might weaken the validity of the experiment

Possible threats:

  1. History: influences other than the ones that we thought/wanted to measure, that could possibly infl. the dependent variables.
  2. Maturation: changes w/i our participants
  3. Testing: weakness when we measure the thing twice and the 1st testing might infl. the 2nd testing
  4. Instrumentation: the tools/machines got worse

difference:

-testing is change in the person. Instrumentation is change in the tools (interviewer could also be a tool)

  5. Statistical regression (regression to the mean): tendency of extreme results to move towards the average on retesting: tall/short parents: kids = closer to average

scatter diagram

Pretest score   Posttest score                         Mean posttest
                7     8     9     10    11    12    13
13                                1     1     1     1     11.5
12                          1     1     2     1     1     11
11                    1     2     3     3     2     1     10.5
10              1     1     3     4     3     1     1     10
9               1     2     3     3     2     1           9.5
8               1     1     2     1     1                 9
7               1     1     1     1                       8.5
Mean pretest    8.5   9     9.5   10    10.5  11    11.5

-regression line (line that goes through the most predictive scores)



Pretest score   Mean posttest
13              11.5
12              11
11              10.5
10              10
9               9.5
8               9
7               8.5

-->shows the people moving to the center (average)
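The move towards the center can be recomputed from the scatter diagram itself. The sketch below rests on an assumption: the extracted diagram lost its column positions, so the cells in `counts` are placed where they reproduce the row averages given in the notes. The names `counts` and `mean_posttest` are made up for the illustration.

```python
# counts[pretest score][posttest score] = number of people in that cell.
# Cell positions are an assumption, chosen to match the averages in the notes.
counts = {
    13: {10: 1, 11: 1, 12: 1, 13: 1},
    12: {9: 1, 10: 1, 11: 2, 12: 1, 13: 1},
    11: {8: 1, 9: 2, 10: 3, 11: 3, 12: 2, 13: 1},
    10: {7: 1, 8: 1, 9: 3, 10: 4, 11: 3, 12: 1, 13: 1},
    9:  {7: 1, 8: 2, 9: 3, 10: 3, 11: 2, 12: 1},
    8:  {7: 1, 8: 1, 9: 2, 10: 1, 11: 1},
    7:  {7: 1, 8: 1, 9: 1, 10: 1},
}

def mean_posttest(row):
    """Average posttest score of everyone with the same pretest score."""
    people = sum(row.values())
    return sum(score * n for score, n in row.items()) / people

for pretest, row in counts.items():
    mean = mean_posttest(row)
    print(f"pretest {pretest:>2} -> mean posttest {mean:<5} (moved {mean - pretest:+.1f})")
```

People who scored above 10 on the pretest average lower on the posttest, and people who scored below 10 average higher, even though nothing was done to them in between.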

  6. Selection: perhaps the results are b/c of the participants chosen (lack of randomness) -->they were prone to those results from b/f the experiment! -->it is not that the manipulation worked, but that the groups were diff. from the beginning (chose a specific course/class -->specific groups would go to that course)
  7. Mortality (attrition): people leave the experiment
  8. Interaction: interaction b/w the diff. threats. For example: the manipulation only worked on a certain group -->only worked b/c it was a special group.

-->interaction b/w manipulation and maturity

3 kinds of experiments

  1. experimental
  2. quasi-experimental
  3. pre-experimental


Bad types of experimental setup

1 shot case study

XO (X = manipulation/O = measurement)

-->we measure 1 group: after the manipulation, we measure its effect

problems:


1 group pretest-posttest design

O1XO2

Problems

  1. History: other factors, other than the experiment infl.
  2. Testing
  3. Instrumentation -->whenever there is a problem w/ testing, there usually is a problem w/ instrumentation
  4. Regression: you don’t know whether it is extreme
  5. Maturation:
  6. Interaction: still can infl. only this group b/c they are unique.

**

Static group comparisons

-I have 2 groups:

-XO and O

-->1 group w/ manipulation and 1 w/o

-->not random (i.e. 1 underwent the course and 1 didn’t)

Problems:




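Quasi-experimental studies, weaknesses of the experiment: continued

5) Statistical regression: regression to the mean, the tendency of the extremes to move towards the mean.

The average height of the children of very tall parents will be lower than the parents’ average, and the average height of the children of very short parents will be higher than the parents’ average. Why is this a weakness? Is the improvement a result of repeated practice, or of regression to the mean, i.e. the tendency of those w/ a low score to improve their scores on another attempt?

The assumption behind the idea of regression is that every test has an error, and at the extremes the errors are bigger. In the middle of the distribution the error can go both towards the high side and towards the low side, and so people stay more or less in the middle. The low scorers have errors that tend upwards, and the high scorers the opposite.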

Two-way table (scattergram): distribution of the scores by 1st and 2nd measurement

Pretest score   Posttest score                         Mean posttest
                7     8     9     10    11    12    13
13                                1     1     1     1     11.5
12                          1     1     2     1     1     11
11                    1     2     3     3     2     1     10.5
10              1     1     3     4     3     1     1     10
9               1     2     3     3     2     1           9.5
8               1     1     2     1     1                 9
7               1     1     1     1                       8.5
Mean pretest    8.5   9     9.5   10    10.5  11    11.5

The line is the regression line: the line w/ which I predict. It is the line around which the sum of the errors is the smallest.

[Figure: two panels plotting pretest against posttest scores (7-13) w/ the regression lines]

Regression to the mean: the tendency to converge towards the mean.

The correlation line is the average of the 2 regression lines.



Threats / weaknesses of the research: continued

6. Selection

The results that we get are not b/c of the manipulation, but b/c the people were already like that b/f the manipulation.

7. Mortality (attrition)

Dropout of participants. A certain type of participant dropped out, and b/c of that I got a diff. in the results, unrelated to what I thought.

8. Interaction

Interaction b/w the diff. threats. There may be selection of participants: the manipulation worked, but only on a selective group; on another group it wouldn’t have worked. A curriculum for 5th and 7th grade has a diff. effect on each grade: it could be b/c there is more maturity in 7th grade, so the treatment affects the diff. age groups differently.

Research designs at different levels

1. The first level: Preexperimental design

These are low levels of research, such as:

a. One shot case study (XO). X = the course, O = the attitude scale that I gave afterwards. A study w/ 1 group in which, after the manipulation, we check 1 effect that we think the manipulation caused. For example: did the course in Judaism lead to a high percentage of secular people w/ positive opinions of religious people? The more weaknesses a study has, the worse it is. Doesn’t overcome history (something other than the manipulation caused the results), doesn’t overcome maturation, nor testing. Instrumentation is a problem of 2 measurements, and here I don’t have 2 measurements. Regression: not relevant. Selection says that the students had positive opinions to begin with, so it doesn’t overcome selection. Mortality: everyone who didn’t like religious people left the course, so it doesn’t overcome it. Interaction is the idea that the course may have infl. only those who chose it, and not all people in a similar way.

b. One group pretest-posttest design (O1XO2): I added a measurement b/f the course. Doesn’t overcome history or maturation. Testing: I have 2 measurements, so it is possible that they understood the questionnaire. Instrumentation: the interviewer knows the students and therefore speaks to them differently, and they improve. Regression: can’t know whether the attitudes are extreme. Selection: overcomes it, since I know how the participant was beforehand. Mortality: overcomes it, b/c w/ 2 measurements you can know whether people dropped out or not, unlike w/ 1 measurement where you don’t know. Doesn’t overcome interaction: b/c here too the manipulation could infl. this group, which came to the course to begin with b/c of certain attitudes or tendencies.

c. Static group comparison (XO / O): the researcher didn’t sort the participants into the diff. groups but took them as they are. I compare 2 groups: secular people who took the course in Judaism and a group of secular people who didn’t take the course. 1 group w/ the manipulation and the other w/o. I ask for the secular people’s opinion of religious people. Overcomes history, overcomes maturation; testing is not relevant (there aren’t 2 measurements), same for instrumentation and regression; doesn’t overcome selection (b/c I don’t have another measurement from b/f), mortality, or interaction.



The eight weaknesses / the design

(+ = overcomes, - = doesn't overcome, ? = unclear, n/a = not relevant)

Design                               History  Maturation  Testing  Instrumentation  Regression  Selection  Mortality  Interaction
One shot case study (XO)             -        -           n/a      n/a              n/a         -          -          -
One group pretest post test design   -        -           -        -                ?           +          +          +
Static group comparison              -        +           n/a      n/a              ?           -          -          -


2. Experimental design

Solomon four-group design

R = random

4 patterns of manipulation/observation

R   O1   X   O3

    O2       O4

         X   O5

             O6

--> you can check each of the weaknesses:

History: if you have 2 measurements, no prob., since you compare an O to an XO to see whether it is the X that has the infl. or not.

Maturation: no prob., since you can't say that 1 group matured more than the other groups.

Testing: no prob., since

1) comparing the 1st and 2nd rows: if there is a diff., it must be b/c of X and no other reason

2) comparing O3 w/ O5 and O4 w/ O6: if there is no diff. --> no infl. of the testing

Instrumentation: something changes in the measurement tools. If you compare O2 and O4 --> if diff.: prob. w/ instrumentation (but could also be tools).

Regression: I tested extreme groups, which had to even down to the average. No prob., since

1) you split the groups randomly

2) if it was an extreme group, you would have a diff. b/w O1 and O2

Selection: no prob.: I choose who gets the manipulation and who doesn't. I don't choose a scenario as a manipulation.

Mortality: if I can show that no-one left --> no prob. In the case of 2 measurements, even w/ the same group, I can see if people left.

Interaction: no prob.: I can show that there is no interaction b/w those who had the manipulation (b/f or after) and those who didn't (b/f or after). --> if there is interaction: prob. In order to say that X's infl. on the later O is not only an interaction w/ the first O, you have to run the same experiment w/o the former O and see if there is interaction b/w the 1st set of trials and the 2nd. If interaction --> prob!!!


Solomon four-group design is really comparing 2 experiments
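
The comparisons described above can be sketched in a few lines. This is only an illustration: the function name `solomon_effects` and all the group means are hypothetical, not data from the class.

```python
def solomon_effects(o1, o2, o3, o4, o5, o6):
    """Basic comparisons in a Solomon four-group design (group means).

    Group 1: R O1 X O3   (pretest, manipulation, post-test)
    Group 2: R O2   O4   (pretest, post-test)
    Group 3: R    X O5   (manipulation, post-test)
    Group 4: R      O6   (post-test only)
    """
    return {
        # Randomization check: the two pretested groups should start out equal.
        "pretest_equivalence": o1 - o2,
        # Experiment 1: effect of X when a pretest was given.
        "effect_with_pretest": o3 - o4,
        # Experiment 2: effect of X when no pretest was given.
        "effect_without_pretest": o5 - o6,
        # Effect of having been pretested, among groups without X.
        "testing_effect": o4 - o6,
        # Does the pretest change how strongly X works?
        "testing_x_interaction": (o3 - o4) - (o5 - o6),
    }


# Hypothetical group means, invented for illustration only.
effects = solomon_effects(o1=50, o2=50, o3=65, o4=52, o5=64, o6=51)
for name, value in effects.items():
    print(name, value)
```

If the two effect estimates agree and the interaction term is close to 0, the pretest did not change how the manipulation worked.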

Pretest-post test control group design

R   O1   X   O3

    O2       O4

History: once you have 2 measurements, no prob., since you compare an O to an XO to see whether it is the X that has the infl. or not.

Maturation: no prob., since there is no reason to say that 1 group matured more than the other. Once you have 2 groups, you have to be able to say that the 2 groups are the same (i.e. randomness), so a diff. is not b/c of specific maturation.

Testing: no prob., since comparing the 1st and 2nd rows: if there is a diff., it must be b/c of X and no other reason.

Instrumentation: something changes in the measurement tools. If you compare O2 and O4 --> if diff.: prob. w/ instrumentation (but could also be tools).

Regression: since the results of O3 and O4 are diff. --> you must assume that there is no regression **

Selection: no prob. if I compare O1 and O2 and see that they are the same.

Mortality: if I can show that no-one left --> no prob. In the case of 2 measurements, even w/ the same group, I can see if people left.

Interaction: problem: I do not have enough info to see if there is internal interaction --> prob!!!

Randomized-two-group design

R   X   O5

        O6

History: + no prob., since once I have 2 groups, I can look at the diff. b/w them to see if there are external variables influencing the results.

Maturation: + no prob., since there is no reason to say that 1 group matured more than the other. Once you have 2 groups, you have to be able to say that the 2 groups are the same (i.e. randomness), so a diff. is not b/c of specific maturation.

Testing: ? not relevant: only 1 measurement --> no change in the measure! (?)

Instrumentation: ? not relevant, ibid. (?)

Regression: (?) I can't show whether there is regression (despite randomization) b/c you don't have an initial measurement of the groups to see the diff. b/w them.

Selection: (?) you don't know the prior state, b/c you don't have the 1st measurement (O), and therefore you are left w/ a question mark.

Mortality: + if I can show that no-one left --> no prob. I know how many people came in, so I can tell how many left (unlike questionnaires, where I don't know how many people refused to take the questionnaire once they saw its content).

Interaction: - problem: I do not have enough info to see if there is internal interaction --> prob!!!



--

Quasi-experimental design

- (interrupted) time-series design

imp: --> NO RANDOMNESS IN quasi-experimental design

- as seen above: if you want to reduce weaknesses --> add observations or measurements

- quasi-experimental design: some measurements beforehand / manipulation / a few measurements afterwards:

i.e. O1, O2, O3, O4, X O5, O6, O7, O8

example:

- astronauts on space-walks = nervous, therefore more swearing. I.e. O1-O4, X = space-walk [O5 measured during X (the space-walk)], and then O6-O8

[Figure: time-series plot over the observations O1 to O8]


History: - b/c you can't measure all factors, i.e. he might have been upset at election results.

Maturation: +

Testing: + if there were a testing effect, the curve would have a certain 1-directional slope.

Instrumentation: ibid.

Regression: + since there are no zigzags in the curve.

Selection: + selection: from the beginning the ... since you compare diff. observations and see if ...

Mortality: + same people in the whole experiment.

Interaction: - problem: I do not have enough info to see if there is internal interaction --> prob!!!

- only 3 groups

- interaction b/w the situation


Variations of the time-series response

--> effect is lasting (i.e. vaccination)

--> delay b/w manipulation and reaction

Multiple time series design

- no randomization/no groups

- i.e. after a certain thing happens in 1 city, compare it to another place

--> not a random place, i.e. after a movie was screened in 1 city but not in another city: see the change in the 1st city, and compare it to what happens in the city where the movie wasn't screened

Group 1: O1 O2 O3 O4 O5 O6 O7 O8

Group 2: O1 O2 O3 O4 O5 O6 O7 O8

Multiple time series design:

History: + if you compare O1 of each group and see that they are the same, then you see that there are no intervening variables (which rules out history).

Maturation: +

Testing: + if there were a testing effect, the curve would have a certain 1-directional slope.

Instrumentation: ibid.

Regression: + since there are no zigzags in the curve.

Selection: + selection: from the beginning the ... since you compare diff. observations and see if ...

Mortality: + same people in the whole experiment.

Interaction: - I don't have the info to see if there is interaction b/w place and manipulation. They could be more violent b/c of other things, such as terror, or other reasons.


--> unlike time series, multiple time series can deal w/

Regression discontinuity

- continuation of static group comparison




Cross-sectional study: comparison of diff. groups at the same time

Longitudinal study: follow 1 group over a long time


X1 X2 X3 O4 O5 O6

History: + once you have more than 1 group, you can compare b/w them: if they are the same in the beginning --> no prob.

Maturation: + no reason to assume that 1 group matures (learns) faster than another.

Testing: - only 1 measure, therefore can't compare the diff. in testing b/w 2 observations.

Instrumentation: ibid.

Regression: + usually regression brings people closer to the average, yet here the weaker (extreme) groups aren't pulled closer to the average, but rather move upwards.

Mortality: + we see if people left or not.

Interaction: - no! it might only work on weaker students --> selection.



Pretest/posttest non-equivalent control group

O1   X   O2

O1       O2

--> for example, the 1st group is a frontal-learning classroom; the other is a jigsaw classroom

History: + b/c both groups have all the same variables except the teaching style. Therefore, there aren't external variables.

Maturation: + no reason to assume that 1 group matures (learns) faster than another.

Testing: + we don't think that there is a testing diff., b/c we see 2 diff. groups reacting in diff. ways. The test didn't change w/ time, since you would have seen that equally in both groups. Since the result slopes are diff. --> must be b/c of X.

Instrumentation: ibid.

Regression: depends! If 1 group's line crosses the consistent line of the other group --> no regression, since you can't regress beyond the average. If the 2nd group's slope comes close to, but not over, the horizontal line --> might be regression.

Selection: + you can tell whether you selected a group prone to those results, since you measured the group b/f X to see if they were prone to the results before the manipulation.

Mortality: + we see if people left or not.

Interaction: - you might have interaction b/w the manipulation and the specific teacher --> other teachers might teach better/worse in each manipulation type.



Measurements

- Giving #'s systematically to phenomena

- #'s have no meaning w/o a definition of what each # on the scale means

- you have a set of objects, set A (say participants): a1, a2, a3, a4, a5

- you have a set B, gender (b1, b2); you classify the members of A into set B = that is a kind of measurement

measurement = turning results into #'s that rank

--> you have to have isomorphism: connection/logic in the arrangement of the numbers:

--> need to have a connection b/w the numbers and what you're measuring

--> if no consistency = lower isomorphism

for example: if you have test results, and several participants have the same mark, you add up all the ranking #'s they occupy and divide by the # of participants w/ that score


Real mark: the accurate ranking

Used mark: taking same results into account (as seen above)

--

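90 --> 1.5

90 --> 1.5

70 --> 7

80 --> 3.5

80 --> 3.5

70 --> 7

70 --> 7

70 --> 7

70 --> 7

60 --> 10

(the two 90s share ranks 1-2, the two 80s share ranks 3-4, and the five 70s share ranks 5-9, so each 70 gets (5+6+7+8+9)/5 = 7)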
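
The tie rule described above (add up the ranks that the tied marks occupy and divide by the number of tied participants) can be sketched as follows. The helper name `average_ranks` is made up for this illustration; the marks are the ten from the list above.

```python
def average_ranks(scores):
    """Rank scores from highest (rank 1) to lowest.

    Participants with the same score all receive the mean of the
    rank positions that their score occupies.
    """
    ordered = sorted(scores, reverse=True)
    rank_of = {}
    for value in set(ordered):
        first = ordered.index(value) + 1   # first rank position held by this score
        count = ordered.count(value)       # how many participants share the score
        rank_of[value] = first + (count - 1) / 2  # mean of first .. first+count-1
    return [rank_of[s] for s in scores]


marks = [90, 90, 70, 80, 80, 70, 70, 70, 70, 60]
for mark, rank in zip(marks, average_ranks(marks)):
    print(mark, "-->", rank)
```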

Correlation: to check isomorphism

ρ = 1 - 6ΣD² / (N³ - N)

D² = the squared diff. b/w the given ranking and the real ranking of each pair

ΣD² = the sum of all the squared diff.

N = # of pairs

--> the answer of the formula: how good the correlation is

given ranking   real ranking   D²

1               1              0

2               2              0

4               3              1

5               4              1

6               5              1

8               6              4

ΣD² = 7

ρ = 1 - 6(7)/(216 - 6)

= 1 - 42/210

= 1 - 0.2

ρ = 0.8
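
As a cross-check, the rank-correlation formula can be written as a short function. This is a minimal sketch; the function name `spearman_rho` is an assumption, and the two rankings are the six pairs from the table above.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation: 1 - 6*sum(D^2) / (N^3 - N)."""
    n = len(ranks_a)  # N = number of pairs
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - (6 * sum_d2) / (n ** 3 - n)


given_ranking = [1, 2, 4, 5, 6, 8]
real_ranking = [1, 2, 3, 4, 5, 6]
print(spearman_rho(given_ranking, real_ranking))
```

Note that the factor 6 multiplies the whole sum of squared differences before dividing by N³ - N.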

Measurement Rules

  1. Isoid (equivalence) rule: A=B and A≠B are mutually exclusive
  2. If A=B and B=C then A=C. I.e. the diff. b/w the results of 1st place and 2nd place is not the same thing as the diff. b/w 2nd place and 3rd place.
  3. If A>B and B>C then A>C (transitivity)

scales/levels of measurements

  1. Nominal: only rule #1 exists: a is either the same as b or diff. from b (a=b OR a≠b)
  2. Ordinal: not only quantity, but also an order. The number has no value, but the order does.
  3. Interval: the distances b/w the numbers are meaningful, but I have no info regarding the ratios: I don't have the absolute 0
  4. Ratio: I have all the aforementioned info, including the absolute 0; therefore, I have the exact relation b/w all the numbers used


May 27, 2002

- last class, we saw that there is a diff. b/w the real measurement and the wanted measurement

--> each term is measured/expressed/defined in diff. ways

examples:

- gender: what people are biologically / how they define themselves socially

- now that we have decided how to measure a variable, we need to check:

  1. Reliability: is the tool OK for measuring what we want?
  2. Validity: is this an OK way to measure the specific variable?


- reliability is a necessary but not sufficient condition for validity.

--> if it is not reliable, then it is not valid.

Reliability

Reliability: accuracy/consistency of the measurement of the term/variable across situations


3 approaches to measure reliability

  1. If we measure the same set of objects a second time (given that they weren't changed) --> will we get the same results?
  2. Is it a 'true' measurement?
  3. What is the size of the error of measurement?
    1. 2 kinds of errors:
      1. Systematic error: constantly making the same mistake: the watch lags by 1 hour every 24 hours --> reliable, but not valid.
        1. Xt = X∞ + e (obtained score = true score + error)
      2. Random error: inconsistent mistakes: lagging by various amounts in diff. cases --> neither reliable nor valid


Source                  SS  df  MS

Item                    --> ignore

Individual              30.17 --> Vt (also V(ind))

Residual (also error)   7.14 = Ve


So plug into the formula:

r = (Vt - Ve) / Vt = (30.17 - 7.14) / 30.17 ≈ 0.76
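
A minimal sketch of this step, assuming the formula being plugged into is r = (Vt - Ve) / Vt (that is how the numbers in the notes are combined); the function name `reliability_from_variances` is hypothetical.

```python
def reliability_from_variances(v_total, v_error):
    """Reliability as the share of individual variance that is not error."""
    if v_total <= 0:
        raise ValueError("total (individual) variance must be positive")
    return (v_total - v_error) / v_total


# Values taken from the table in the notes: Vt = 30.17, Ve = 7.14.
r = reliability_from_variances(30.17, 7.14)
print(round(r, 2))
```

The numerator has to be bracketed: the error variance is subtracted first, and only then is the result divided by Vt.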



Note: in d, everyone is more reserved about answering: systematic error

Variance: the ind's diff. from the average, squared

s² = Σ(Xi - X̄)² / n

(Xi = each ind's score, X̄ = the average, n = # of scores)
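
The variance formula can be sketched in code as below. The function name `variance` is an assumption, and the example scores are the per-subject totals (Σind) from the item table in these notes, used here only for illustration.

```python
def variance(values):
    """Population variance: mean of the squared differences from the average."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


subject_totals = [21, 18, 14, 10, 5]  # per-subject totals from the item table
print(variance(subject_totals))
```

This divides by n, as in the formula above; a sample estimate would divide by n - 1 instead.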

--

C=total squared –n

**get help




Subject \ Item   A    B    c    d    Σind
1                6    6    5    4    21
2                4    6    5    3    18
3                4    4    4    2    14
4                3    1    4    2    10
5                1    2    1    1    5
Σitem            18   19   19   12   68


- I can compare diff. parts of the testing items to get the variance

--> i.e. 1st half w/ 2nd half

--> split-half reliability: compare the odds w/ the evens (people might get tired, so no comparing 1st half w/ 2nd half)


Subject \ Item   A    B    c    d    Σind   Σo(dd)   Σe(ven)
1                6    6    5    4    21     11       10
2                4    6    5    3    18     9        9
3                4    4    4    2    14     8        6
4                3    1    4    2    10     7        3
5                1    2    1    1    5      2        3
Σitem            18   19   19   12   68

(Σo = A + c, Σe = B + d)

--> comparing the Σo and Σe gets reliability




ρ = 1 - 6ΣD² / (n³ - n)

n = # of pairs

(Ro, Re = the ranks of Σo and Σe)

Ro   Re    D     D²
1    1     0     0
2    2     0     0
3    3     0     0
4    4.5   0.5   0.25
5    4.5   0.5   0.25
ΣD² = 0.5

ρ = 1 - 6(0.5)/(125 - 5) = 1 - 3/120 = 0.975


--


Subject \ Item   A    B    c    d    Σind   Σo(dd)   Σe(ven)
1                6    4    5    1    16     11       5
2                4    1    5    4    14     9        5
3                4    6    4    2    16     8        8
4                3    6    4    3    16     7        9
5                1    2    1    2    6      2        4
Σitem            18   19   19   12   68

ρ = 1 - 6ΣD² / (n³ - n)

n = # of pairs

Ro   Re    D     D²
1    3.5   2.5   6.25
2    3.5   1.5   2.25
3    2     1     1
4    1     3     9
5    5     0     0
ΣD² = 18.5

ρ = 1 - 6(18.5)/120

= 1 - 111/120

= 1 - 0.925

= 0.075
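
Both split-half examples can be re-run w/ a short script. This is a sketch under stated assumptions: the helper names (`average_ranks`, `spearman_rho`, `split_half_rho`) are made up for illustration, ties get the average of the ranks they occupy, and the simple rank formula from these notes is used even when there are ties.

```python
def average_ranks(scores):
    """Rank from highest (1) to lowest; tied scores share the mean rank."""
    ordered = sorted(scores, reverse=True)
    rank_of = {}
    for value in set(ordered):
        first = ordered.index(value) + 1
        count = ordered.count(value)
        rank_of[value] = first + (count - 1) / 2
    return [rank_of[s] for s in scores]


def spearman_rho(ranks_a, ranks_b):
    """1 - 6*sum(D^2) / (n^3 - n), with n = number of pairs."""
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - (6 * sum_d2) / (n ** 3 - n)


def split_half_rho(rows):
    """rows: one list of item scores [A, B, c, d] per subject."""
    odd = [sum(row[0::2]) for row in rows]   # 1st and 3rd items (A + c)
    even = [sum(row[1::2]) for row in rows]  # 2nd and 4th items (B + d)
    return spearman_rho(average_ranks(odd), average_ranks(even))


# The two score tables from the notes.
first_table = [[6, 6, 5, 4], [4, 6, 5, 3], [4, 4, 4, 2], [3, 1, 4, 2], [1, 2, 1, 1]]
second_table = [[6, 4, 5, 1], [4, 1, 5, 4], [4, 6, 4, 2], [3, 6, 4, 3], [1, 2, 1, 2]]

print(split_half_rho(first_table))
print(split_half_rho(second_table))
```

In the first table the odd and even halves rank the subjects almost identically; in the second they barely agree, which is what a low split-half reliability looks like.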


coefficient varient??**


r2 = the common part of both tests


- there is a rule in reliability: the more items, the more chance of the results being similar

--> but in split-half reliability, we reduced the # of items

--> so there is a way to fix it: the Spearman-Brown formula


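Spearman-Brown formula:

Rn = n·r / (1 + (n - 1)·r)


Rn = wanted correlation (how much there really is, for the full test)

r = the correlation found b/w the two halves

n = wanted # of items (the # that really exists) / existing items in our split-half test


Example: 10 items in the split half (20 in the full test), r = 0.6

Rn = (20/10 x 0.6) / (1 + (20/10 - 1) x 0.6)

= 1.2/1.6

= 6/8

= 0.75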
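
The Spearman-Brown correction can be sketched as a one-line function. The name `spearman_brown` and its parameter names are assumptions made for this illustration; the example values are the ones used in the notes (split-half correlation 0.6, 20 wanted items vs. 10 existing ones).

```python
def spearman_brown(r, n):
    """Estimated reliability of a test n times as long as the one that gave r.

    r: correlation obtained on the shortened (split-half) test
    n: wanted number of items divided by the existing number of items
    """
    return (n * r) / (1 + (n - 1) * r)


full_test_reliability = spearman_brown(r=0.6, n=20 / 10)
print(round(full_test_reliability, 2))
```

W/ n = 1 the formula returns r unchanged, which is a quick sanity check on the brackets.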


--


Question: why not compare all of them to each other?


N = n(n-1)/2

(N = # of pairs, n = # of items)


For 4 items: N = 4(3)/2 = 6 pairs



Cronbach alpha / internal consistency: correlating each item against each of the others: A vs. B, c, d; B vs. c, d; c vs. d.


--> when using correlation as a score: the correlation of a pair = 1 score; 2 pairs = 2 scores


- question on the test: where is there the most reliability?


Cronbach alpha:

α = n·r̄ / (1 + (n - 1)·r̄)


n = # of TOTAL items

r̄ = the average of the N correlations [N = n(n-1)/2]
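
A sketch of this calculation, assuming the correlations b/w items are ordinary Pearson correlations (the notes do not say which kind is meant). All function names are hypothetical, and the item columns are taken from the first score table in these notes.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def alpha_from_mean_r(n_items, mean_r):
    """alpha = n * r_bar / (1 + (n - 1) * r_bar)."""
    return (n_items * mean_r) / (1 + (n_items - 1) * mean_r)


def standardized_alpha(items):
    """items: one list of scores per item (column), same subjects in each."""
    pairs = list(combinations(range(len(items)), 2))  # n(n-1)/2 item pairs
    mean_r = sum(pearson(items[i], items[j]) for i, j in pairs) / len(pairs)
    return alpha_from_mean_r(len(items), mean_r)


# Item columns A, B, c, d from the first score table in the notes.
columns = [
    [6, 4, 4, 3, 1],
    [6, 6, 4, 1, 2],
    [5, 5, 4, 4, 1],
    [4, 3, 2, 2, 1],
]
print(len(list(combinations(range(4), 2))))  # number of item pairs for 4 items
print(standardized_alpha(columns))
```

The formula has the same shape as Spearman-Brown, w/ the average inter-item correlation in place of the split-half correlation.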




Test-retest reliability: measured against all of the items

--> very similar in reliability to Cronbach,

--> since it tests twice, and each item is considered, since Xt = X + e[rror] **


Alternate forms reliability / parallel forms: when the same question is rephrased when retested

