Geoff Miller


High School Draft Picks: Chapter III, Creating Surveys

In First Year Player Draft Pick Research on June 21, 2010 at 11:23 am

In 1998, for my Master’s Thesis at San Diego State University, I chose to study the “Decision-Making Factors Governing High School Players’ Choice of a College or Professional Baseball Opportunity.” I wanted to know what factors were most important to high school seniors who were drafted and had to choose between signing and going to school, as I had known many players who regretted their choices years after they made them. I revisit my research and discuss my findings with friends and colleagues each year as the First Year Player Draft draws near. Last week, as the Draft was taking place, I decided I would post my entire thesis in an effort to learn more from coaches, parents, and players who have recently been involved in this decision. I’ll be posting a new chapter every few days and will also include pages and pages of subject answers to open-ended questions, which are very interesting and shed lots of light on this process. I’m going to leave out the statistics, surveys, tables, and most appendices, but if you’d like a full electronic copy of my thesis, please just email me and I’ll be happy to send it to you.

Warning…a few of these sections can be a bit dry, to say the least, but most of the reading is interesting stuff and I would be glad to discuss my past and current thoughts on the draft process either on the blog or offline. And please keep in mind that these data are 12-13 years old, so some of the dollar amounts need to be taken in context. I would encourage anyone and everyone to offer feedback and stories so we can all learn more from each other. Chapter III includes the methodology I used to create reliable and valid surveys, then gather my data on professional and collegiate baseball players.

Click Here for Chapter I, Introduction

Click Here for Chapter II, Review of Literature




The purpose of this thesis was to determine factors considered by high school seniors when choosing between a college and professional baseball opportunity.  A secondary purpose was to compare recent college and professional baseball players on the discovered factors.

Determination of Decision Factors

A descriptive research design was employed to determine and analyze important aspects in the present situation (Best, 1959). This design called for the construction of surveys (Bateson, 1984; Braverman, 1996; Carroll & Johnson, 1990). Survey information from two groups, those who chose college and those who chose professional baseball, was analyzed to determine differences in selection factors between groups.

The independent variables were chosen because they grouped subjects into the only two categories for athletes continuing to play baseball after high school.  The initial content of surveys was based on information obtained from a literature search.

Item Pool

An item pool was developed based on the categories discovered through the literature review process.  A set of items for each category was included  (see Appendices A and B for surveys).  Signing bonus information was elicited by three items.  The first asked players how much money they were offered.  The next asked whether money for college was included in the offer.  The final item asked college players how much more money it would have taken for them to sign with the team that drafted them.  It asked professional players if they would have accepted any less money than their contracts stipulated.

Six items determined the players’ educational qualifications and intentions.  Individual items asked high school grade point averages, SAT and ACT scores, whether or not players looked forward to taking college classes while in high school, players’ career goals, and whether or not earning a degree was planned.

Signability was assessed by seven items. First was a series of items devoted to the amateur draft: the round in which the player was drafted, whether he liked or disliked the team that drafted him, whether he was drafted higher, lower, or where expected, and whether he felt playing college baseball would improve his draft status. A second series of items examined what kind of information was given to scouts: whether the player told scouts he would be easy to sign, whether he told them he would be hard to sign, and whether he planned to attend college.

Six items were intended to discover the effects of social influences on players.  One of these items asked for the highest level of education achieved by either parent.  Three more items asked for the perceived influence of parents, coaches, and financial advisors on decisions.  Another item asked whether the player had any family members who had previously or were currently playing college or professional baseball.  A final item asked whether the player considered attending college for social opportunities associated with college life.

College influences were studied with the inclusion of six items.  Scholarship offers were deemed part of this category.  College players were asked whether they played any other sports at their universities and professional players if they considered playing other sports in college.  Three final items asked whether playing for Team USA, playing in a College World Series, and playing for a reputable coach or program were important.

Structure of Survey Items

Each item inquired about a specific possible decision-making selection factor and offered a response set from which each player selected the most appropriate answer.  Responses were created using a bipolar, multiple interval rating scale of semantic differentials (Schwartz & Sudman, 1996).  Schuman and Presser (1981) stated that closed-ended questions typically include five response alternatives.  The inclusion of too many alternatives produces “noise” making discrimination difficult, although more choices would seem to foster more accurate results (Fowler, 1993). It was decided that five responses would be provided for each item, with a “not applicable” choice included as well, to account for the possibility that some items would be inappropriate for some subjects (Rossi, Wright, & Anderson, 1983).

Each set of responses included an equal number of favorable and unfavorable descriptions and a middle neutral alternative (Best, 1959).  Fowler (1993) contended that people who responded higher on the continuum of ordinal responses could accurately be described as feeling stronger about a question, but problems of response interpretations might arise.  Bradburn and Sudman (1979) addressed the meaning of responses, reporting that respondents derive meaning from the relative position of adjectives in a group of response alternatives, not necessarily from the textbook definitions of the words.  In this survey the wording of alternatives was usually repetitious, with only the degree of severity being altered.  For example, a set of alternatives that used “definitely” and “possibly” as agreeable answers used “definitely not” and “possibly not” as disagreeable answers.

Two types of open-ended response items were designed: directed and inclusionary responses.  Directed open-ended responses requested specific information from subjects.  For example, college players were asked about their career goals upon leaving school while professionals were asked to list their career goals upon retiring from baseball.  Some of these response items followed a closed-ended response set.  Directed open-ended response items are denoted with plus signs (+) on each survey (see Appendices A and B).

Inclusionary open-ended response items can be found at the end of each survey subsection.  Subjects were asked to describe any additional factors related to their decisions in an effort to capture further information.  The last two items on the survey asked subjects to relate any factors not covered by any set of items and to list any regrets they had about the decision they made.  Each of those items was underlined on the surveys.

Validation of Items

Surveys were constructed using procedures to establish content validity. A group of experts composed of coaches and scouts (see Appendix C) evaluated items and responses on both professional and college player surveys as part of the content validity procedure (Henderson, 1989; Mussio & Smith, 1973). Each member of the group had an extensive background in high school, college, and/or professional coaching or scouting.

The original surveys were amended based on the responses of the expert panel.  Panel members were instructed (see Appendix D) to read each item and the corresponding response set and circle “valid” if they felt the question was a valid one to ask recent draft picks or “invalid” if they did not.  If they could not assess the validity of the item or simply had no opinion, they were to circle “no opinion”.

To ensure that items had been thoughtfully considered, no more than six unanswered items would be accepted for any coach.  If more than six were unanswered, the coach was eliminated from the panel.  Items were retained if 20% or more of the sample deemed them valid and less than 20% deemed them invalid.  The 20% criterion was an arbitrary but conservative subjective decision on behalf of the investigator.
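
The screening rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the use of `None` to mark an unanswered item, and the choice to count every remaining panel member in the denominator are hypothetical and are not taken from the thesis.

```python
# Hypothetical sketch of the panel screening rule described in the text.
# Assumptions (not from the thesis): None marks an unanswered item, and
# proportions are computed over all votes passed in for an item.

def panel_member_ok(responses, max_unanswered=6):
    """Keep a panel member only if at most `max_unanswered` items are blank."""
    unanswered = sum(1 for r in responses if r is None)
    return unanswered <= max_unanswered


def retain_item(votes, threshold=0.20):
    """Apply the stated criterion: the share of 'valid' votes must reach the
    threshold and the share of 'invalid' votes must stay below it."""
    n = len(votes)
    share_valid = votes.count("valid") / n
    share_invalid = votes.count("invalid") / n
    return share_valid >= threshold and share_invalid < threshold


if __name__ == "__main__":
    # Invented votes for a single item from a ten-person panel.
    votes = ["valid"] * 8 + ["invalid"] + ["no opinion"]
    print(retain_item(votes))
```

Keeping the threshold as a parameter makes it easy to see how the retained item set would change under a stricter criterion.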

Reliability of Items

A total of 16 college players served as subjects for a test-retest reliability procedure.  These players were all surveyed in person on two separate occasions, at least 14 days apart.  These were the only subjects whose anonymity was risked during the collection of data.  To ensure that there would be no connection of names to answers, each survey was coded with an identification number that corresponded to each subject’s name.  The list of names was kept separate from test materials.  This code was then used to match the subjects’ second surveys to their first.  To ensure anonymity, identity markings were not requested of subjects who only took the survey once.  The cover letter describing all instructions to subjects is listed in Appendix E.

A criterion of .70 was deemed the minimum acceptable intraclass coefficient to achieve reliability. A .80 criterion is widely used, but the reliability assessment in this case was determined on an item-by-item basis. The reliability coefficient of a test is a function of the number of questions included in the test. Therefore, a .70 coefficient derived on an item-by-item basis is considered to be highly reliable, even though it is less than the .80 value.
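
One standard way to formalize the link between test length and reliability is the Spearman-Brown formula. The thesis does not name this formula, so the sketch below is an assumption used only to illustrate the argument, and it presumes the items are parallel (equally reliable and measuring the same thing).

```python
# Illustrative sketch: Spearman-Brown projection of reliability for a
# scale built from k parallel items. The formula is not cited in the
# thesis; it is used here only to illustrate the length argument.

def spearman_brown(r_single, k):
    """Projected reliability of a k-item scale given single-item reliability."""
    return k * r_single / (1 + (k - 1) * r_single)


if __name__ == "__main__":
    for k in (1, 2, 5):
        print(k, round(spearman_brown(0.70, k), 3))
```

Under these assumptions, even two items that each reach .70 would project to a combined reliability above the conventional .80 value.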

A limitation to this procedure was that reliability was only assessed for the college survey.  It was assumed that, because of the similarity of the tools, the professional tool would be as reliable as the college tool.

College and Professional Forms of the Tool

Each survey form contained two items that were specifically designed to examine factors exclusive to each group choice and were not appropriate for use in subsequent comparisons or formal analysis.  Exclusive questions have been denoted on each survey (see Appendices A and B) by asterisks.

College and professional players were compared on all items that were common to both forms of the assessment tool.


Subjects for the study were 36 college (mean age = 19.68 years) and 32 professional (mean age = 19.77 years) baseball players. Both groups were convenience samples of volunteers. Recruitment of subjects took place through personal communication with coaches, scouts, teammates, or friends. In some cases, surveys were administered by mail to subjects through these personal confederates, but the majority were administered in person by the investigator. Players from a number of professional organizations made up the pool of professional subjects. A diverse sample of players from Division I universities in different but competitive conferences comprised the college players. As a result of this diverse recruitment of subjects, it is believed that both groups approach representation of their respective populations. Complete lists of the organizations and universities represented in the study are included in Appendix F. The number of subjects recruited was assumed to be sufficient to achieve accurate results (Glass & Hopkins, 1984).

Sample Selection

For two reasons, it was essential that the athletes surveyed had been selected in amateur drafts in recent years. First, the more recently the decision had been made, the more the athlete would be able to remember about the decision (Schwartz & Reisberg, 1991). Second, the magnitude of signing bonus offers had increased markedly in the last decade (Shaiken, 1997), making the decision a completely different one for recent high school athletes than for high school athletes of 8-10 years ago.

A college athlete can take up to five years to use four seasons of eligibility.  Therefore, except for rare occasions, the population of current college baseball players finished its senior year of high school within the five years prior to the study.  All professional subjects were recruited with the stipulation that they had graduated from high school, and in turn, had been drafted within the last five years.

It was imperative that subjects from both groups had been faced with the choice of going to college or playing professional baseball.  Only college players who had been drafted in high school were selected to serve as subjects.  College players who had been drafted in junior college and college seniors who were drafted as college juniors were not eligible to participate in the study unless they had also been drafted in high school.  Similarly, it was important that all professional subjects had the necessary academic credentials to attend college.  It was not necessary for professional players to have been offered college scholarships in order to participate in the study.

Dependent Variables

The dependent variables in the study were the informational items collected in the surveys.  For purposes of data analysis, each item represented a different dependent variable.

Data Analyses

Test-retest reliability for each item was determined from an intraclass correlation coefficient (R) obtained via one-way repeated measures analysis of variance (ANOVA).
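
As a rough illustration of how a test-retest coefficient of this kind can be computed, the sketch below derives an intraclass correlation from the mean squares of a one-way ANOVA. It assumes the one-way random-effects form, often written ICC(1,1); the thesis does not state which ICC form was used, and the function name and sample scores are hypothetical.

```python
# Illustrative sketch: intraclass correlation from one-way ANOVA mean squares.
# Assumption (not from the thesis): the one-way random-effects form,
#   ICC = (MS_between - MS_within) / (MS_between + (k - 1) * MS_within)
# where k is the number of occasions per subject.

def icc_oneway(scores):
    """scores: one list per subject, holding that subject's response on
    each occasion (here, test and retest). Needs some score variation."""
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of occasions per subject
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]

    ss_between = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_within = sum(
        (x - m) ** 2 for row, m in zip(scores, subject_means) for x in row
    )
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


if __name__ == "__main__":
    # Invented test/retest responses on a 1-5 scale for six subjects.
    item_scores = [[1, 1], [2, 3], [3, 3], [4, 4], [5, 4], [2, 2]]
    print(round(icc_oneway(item_scores), 3))
```

Running the function once per survey item and comparing each result with the .70 criterion mirrors the item-by-item assessment described earlier.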

Chi-square analysis was used to examine differences between college and professional subjects on all questions that were common to both surveys.
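
A minimal sketch of the chi-square comparison for a single item follows. The counts are invented for illustration (only the group sizes of 36 and 32 come from the study), and the function name is hypothetical. The sketch computes only the test statistic and degrees of freedom; obtaining a p-value would additionally require the chi-square distribution, for example from a statistics library.

```python
# Illustrative sketch: Pearson chi-square statistic for a contingency table
# of group (rows) by response alternative (columns).
# Response categories that nobody chose must be dropped first, since an
# expected count of zero would cause a division by zero.

def chi_square(table):
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


if __name__ == "__main__":
    # Invented counts across five response alternatives for one item.
    college = [4, 6, 8, 10, 8]        # sums to 36
    professional = [10, 9, 6, 4, 3]   # sums to 32
    stat, df = chi_square([college, professional])
    print(round(stat, 3), df)
```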

If variables were found to discriminate between the groups, they were further correlated with each other to assess whether they measured mostly the same information. A Spearman rank-order correlation coefficient (rs) was used because of the ordinal nature of the data. A practical threshold of rs = .71 was established for the “noteworthiness” of a correlation coefficient. The practicality of that level was that its square is approximately .50, indicating that there was more (>50%) commonality between the variables than difference. When two variables that discriminated were sufficiently related, they would be discussed together because of their level of similarity.
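
The sketch below shows one way to compute a Spearman coefficient for ordinal responses with ties and to check it against the .71 level. The helper names and the sample responses are hypothetical. It uses the common approach of assigning average ranks to tied values and then computing a Pearson correlation on the ranks.

```python
import math

# Illustrative sketch: Spearman rank-order correlation with tied ranks,
# computed as the Pearson correlation of average ranks.

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of 1-based positions i..j
        for t in range(i, j + 1):
            ranks[order[t]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Invented 1-5 responses to two items from eight subjects.
    item_a = [1, 2, 2, 3, 4, 4, 5, 5]
    item_b = [2, 1, 3, 3, 4, 5, 4, 5]
    rs = spearman(item_a, item_b)
    # Shared variance is rs squared; .71 squared is just over .50.
    print(round(rs, 3), round(rs ** 2, 3), abs(rs) >= 0.71)
```

Squaring the coefficient makes the rationale for the cutoff visible: .70 squared falls just under one half, while .71 squared falls just over it.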

