For the same variable, the methods can be different depending on the type of fishery and the data source (fishers, processors, buyers etc.). Buyers, processors and other intermediaries are likely to keep their own sales records, which should be used as the basis of data forms.
Small-scale fishers often do not keep any records, and data acquisition in this case would be restricted to one-to-one interviews, but the interview structure could be more flexible. Data collection should be conducted at intervals sufficiently frequent for the management purpose. For example, data for stock monitoring have to be collected constantly, while household data can be at much longer time intervals.
In general, frequently collected data will probably have to rely on fishers or industry personnel providing the data. Less frequent data can use enumerators, since the costs of collection are much lower. There are cases when fishery data collection programmes cannot be operated on a regular basis because of operational limits. These cases include small-scale fishing operations in many inland or remote marine areas, where fishing operations are spread over a large area with part-time fishers using a large array of fishing gears and techniques, sometimes in many different habitats.
Under these circumstances, a number of alternative approaches can be taken to assess the fisheries. All of these can be used for cross-checking landings data as well as providing production and socio-cultural information.
Many variables can be collected by more than one method and at different points from fishers to consumers. Where possible, data should be collected from several sources to crosscheck for errors. For example, catch data collected through logbooks can be cross-checked against reported landings based on sales slips, data collected by interview at landing sites and even consumer or trade data. In almost all cases, many different variables can be collected simultaneously.
For example, length frequency, species composition, average weight and first sale price can all be obtained when vessels land their catch. Collecting data for several purposes at once reduces costs, and due account should be taken of this when planning the data collection programme. There are strong links between the types of data, where they can be obtained and the methods available for their collection.
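As a concrete illustration of cross-checking one data source against another, the sketch below compares per-vessel logbook totals with sales-slip totals and flags large discrepancies. The vessel IDs, catch figures and the 10 percent tolerance are invented for the example, not part of any standard form.

```python
# Hypothetical cross-check of logbook catch totals against sales-slip landings.
# Vessel IDs, figures and the tolerance are invented for illustration.

logbook = {"V001": 1250.0, "V002": 430.0, "V003": 980.0}       # kg per vessel
sales_slips = {"V001": 1190.0, "V002": 425.0, "V003": 1400.0}  # kg per vessel

def flag_discrepancies(a, b, tolerance=0.10):
    """Return vessel IDs whose two reported totals differ by more than `tolerance`."""
    flagged = []
    for vessel in sorted(set(a) & set(b)):
        reference = max(a[vessel], b[vessel])
        if abs(a[vessel] - b[vessel]) / reference > tolerance:
            flagged.append(vessel)
    return flagged

print(flag_discrepancies(logbook, sales_slips))
```

Flagged vessels would then be followed up, for example by interview at the landing site.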
This section provides a guide for selecting data collection methods in relation to the data type and source, and gives some indication of what types of data can be collected simultaneously. The most direct approach is to obtain data from the fishing operations themselves; beyond these, data can be collected throughout the post-harvest chain. This may include middle persons, fish auctions, cold storage, processing firms and the transport of products. It may also include the fish market at the landing port and secondary-market transactions among brokers, processing firms and consumers' markets.
It would also include various agencies outside of fisheries. These tables are intended to give some guidance for selecting collection methods and sources, and for designing a data collection system. The tables also give an indication of what types of data can be collected simultaneously at the same source with the same method.
Numbers in brackets refer to relevant sections in the main text. The choice of the unit of measure will affect the method of collection, the design of the recording form and later analyses.
For example, catch can be recorded in units of 1 kg, 10 kg or other denominations. The total estimated catch can be disaggregated into species by relative proportions, or each species' mass can be estimated separately.
Fishers' age can be recorded by year categories or locally derived groups such as "apprentice", "active" or "semi-retired". However, there is little point in requesting a captain to report and record the catch from a haul to the nearest kilogram, when his estimates are only accurate to the nearest tonne. If more precise measurements are required, the catch will have to be weighed on landing.
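The points above about units and disaggregation can be illustrated with a short sketch: a total catch estimate, accurate only to the nearest 100 kg, is split into species by sampled proportions and rounded to the precision the estimate actually supports. All figures are invented.

```python
# Illustrative sketch: disaggregating an estimated total catch into species
# masses using sampled relative proportions, then rounding to the nearest
# 100 kg because the total itself is only that accurate. Figures are invented.

total_catch_kg = 5200.0
sample_proportions = {"tuna": 0.55, "mackerel": 0.30, "sardine": 0.15}

def disaggregate(total, proportions, unit=100):
    """Split `total` into species masses, rounded to the nearest `unit` kg."""
    return {sp: round(total * p / unit) * unit for sp, p in proportions.items()}

print(disaggregate(total_catch_kg, sample_proportions))
```

Reporting the result in 100 kg units avoids implying more precision than the captain's estimate carries.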
Sometimes decisions on the units of measure are complicated by the type of data to be collected, and data values may need to be represented by codes. Registration can be used to obtain complete enumeration through a legal requirement.
Registers are implemented when there is a need for accurate knowledge of the size and type of the fishing fleet and for closer monitoring of fishing activities to ensure compliance with fishery regulations. They may also incorporate information related to fiscal purposes. Although registers are usually implemented for purposes other than to collect data, they can be very useful in the design and implementation of a statistical system, provided that the data they contain are reliable, timely and complete.
Data on vessel type, size, gear type, country of origin, fish holding capacity, number of fishers and engine horsepower should be made available for the registry.
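A minimal sketch of how such a register record might be structured follows. The field names, types and example values are illustrative assumptions only, not a prescribed registry schema.

```python
# A minimal, hypothetical vessel-register record holding the fields listed
# above. Field names and types are illustrative assumptions, not a standard.

from dataclasses import dataclass, asdict

@dataclass
class VesselRecord:
    vessel_id: str
    vessel_type: str
    length_m: float          # vessel size
    gear_type: str
    country_of_origin: str
    hold_capacity_t: float   # fish holding capacity, in tonnes
    crew_size: int           # number of fishers
    engine_hp: float         # engine horsepower

record = VesselRecord("FV-0001", "trawler", 24.5, "bottom trawl",
                      "Norway", 80.0, 6, 450.0)
print(asdict(record))
```

Keeping the record as structured data makes later stratification (by vessel type, size class or gear) straightforward.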
Companies dealing with fisheries agencies are registered for various purposes. These companies may include not only fishing companies, but also other types of companies involved in processing and marketing fishery products. Data, such as the number of vessels, gear type and vessel size of registered fishing companies, should be recorded during such registration.
Processing companies should provide basic data on the type of processing, type of raw material, capacity of processing, and even the source of material. Fishing vessels and fishing gears may often be required to hold a valid fishing licence. Unlike vessel registers, licences tend to be issued for access to specific fisheries over a set period of time. Because licences may have to be periodically renewed, they can be a useful way to update information on vessel and gear characteristics. If licences must be renewed each year, data collected from licensing is particularly useful, as records are updated on an annual basis.
Registry data also contain criteria for the classification of fishing units into strata. These classifications are usually based on assumptions and a priori knowledge regarding differences in catch rates, species composition and species selectivity. In general, vessel registers are complex systems requiring well-established administrative procedures supported by effective data communications, data storage and processing components.
As such, they predominantly deal with only certain types and size of fishing units, most often belonging to industrial and semi-industrial fleets. Small-scale and subsistence fisheries involving large numbers of fishing units are often not part of a register system or, if registered, are not easily traced so as to allow validation or updating.
Questionnaires can be handed out or sent by mail and later collected or returned by stamped addressed envelope. This method can be adopted for the entire population or sampled sectors. Questionnaires may be used to collect regular or infrequent routine data, and data for specialised studies. While the information in this section applies to questionnaires for all these uses, examples will concern only routine data, whether regular or infrequent.
Some of the data often obtained through questionnaires include demographic characteristics, fishing practices, opinions of stakeholders on fisheries issues or management, general information on fishers and household food budgets. A questionnaire requires respondents to fill out the form themselves, and so requires a high level of literacy. Where multiple languages are common, questionnaires should be prepared using the major languages of the target group.
Special care needs to be taken in these cases to ensure accurate translations. In order to maximise return rates, questionnaires should be designed to be as simple and clear as possible, with targeted sections and questions. Most importantly, questionnaires should also be as short as possible.
If the questionnaire is being given to a sample population, then it may be preferable to prepare several smaller, more targeted questionnaires, each provided to a sub-sample. If the questionnaire is used for a complete enumeration, then special care needs to be taken to avoid overburdening the respondent.
If, for instance, several agencies require the same data, attempts should be made to co-ordinate its collection to avoid duplication. Almost any data variable can be obtained through questionnaires. For example, catch or landing information can be collected through questionnaires from fishers, market middle-persons, market sellers and buyers, processors etc. Likewise, socio-economic data can also be obtained through questionnaires from a variety of sources. However, in all cases the variables obtained are reported opinions rather than direct measurements, and so may be subject to serious errors.
Questionnaires, like interviews, can contain either structured questions with blanks to be filled in, multiple-choice questions, or open-ended questions where the respondent is encouraged to reply at length and choose their own focus to some extent.
To facilitate filling out forms and data entry in a structured format, the form should ideally be machine-readable, or at least laid out with data fields clearly identifiable and responses pre-coded. In general, writing should be reduced to a minimum. In an open-ended format, keywords and other structuring procedures should be imposed later to facilitate database entry and analysis, if necessary.
Structured interviews are performed using survey forms, whereas open interviews are notes taken while talking with respondents. The notes are subsequently structured (interpreted) for further analysis. As in preparing a questionnaire, it is important to pilot test forms designed for the interviews.
Even the designer's best attempts at clarity and focus cannot anticipate all possible respondent interpretations.
A small-scale test prior to actual use for data collection will assure better data and avoid wasting time and money. Although structured interviews can be used to obtain almost any information, as with questionnaires, information is based on personal opinion. Data on variables such as catch or effort are potentially subject to large errors, due to poor estimates or intentional errors of sensitive information.
Focus groups are small groups composed of representative members of a population whose beliefs, practices or opinions are sought. Panel surveys involve the random selection of a small number of representative individuals from a group, who agree to be available over an extended period, often one to three years. During that period, they serve as a stratified random sample of people from whom data can be elicited on a variety of topics.
Interviews differ from questionnaires in that forms are filled in by researchers instead of respondents. While this approach is more expensive, more complicated questions can be asked and data can be validated as they are collected, improving data quality. Interviews can be undertaken with a variety of data sources (from fishers to consumers), and through alternative media, such as by telephone or in person.
Structured interviews form the basis for much of the data collection in small-scale fisheries. In an interview approach for sampling catch, effort and prices, the enumerators work according to a schedule of landing site visits to record data.
Enumerators can be mobile (that is, sites are visited on a rotational basis) or resident at a specific sampling site. The sample should be as representative as possible of fleet activities.
Some additional data related to fishing operations may be required for certain types of fishing units, such as beach seines or boats making multiple fishing trips in one day. For these, the interview may cover planned activities as well as activities already completed. Enumerators can be mobile (that is, homeports are visited on a rotational basis) or resident at a specific sampling site.
In many cases, they combine the interview method with direct observations. Direct observations can be used to identify inactive fishing units by observing those that are moored or beached, provided the total number of vessels based at the homeport is already known, perhaps from a frame survey or register. Often enumerators will still have to verify through interviews during the visit that vessels are fishing, as opposed to engaging in other activities.
The pure interview approach can be used in those cases where a pre-determined sub-set of the fishing units has been selected. The enumerator's job is to trace all fishers on the list and, by means of interviews, determine which were active during the sampling day. For sites involving a workable number of fishing units, complete coverage may be possible. Sometimes it is possible to ask questions on fishing activity which refer to the previous day or even to two days back.
This extra information increases the sample size significantly at little extra cost, ultimately resulting in better estimates of total fishing effort. In practice, observers do not only make direct measurements (observations), but also conduct interviews and surveys using questionnaires.
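One common way to turn the activity interviews described above into an estimate of total fishing effort is to compute a boat activity coefficient and raise it by the known fleet size and number of days. The simple raising formula and all figures below are illustrative assumptions.

```python
# Hedged sketch: raising sampled activity to total fishing effort via a
# boat activity coefficient (BAC). The raising formula and all figures are
# illustrative assumptions.

total_boats = 120     # known fleet size, e.g. from a frame survey or register
days_in_month = 30

sampled_boats = 40    # boats covered by activity interviews on the reference day
active_boats = 26     # of those, boats that actually fished

bac = active_boats / sampled_boats             # estimated probability a boat fishes
boat_days = bac * total_boats * days_in_month  # estimated effort, in boat-days

print(round(bac, 2), round(boat_days))
```

Interviews covering the previous day as well effectively double `sampled_boats`, tightening the BAC estimate at little extra cost.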
They might also be involved in data processing and analysis. The tasks of an observer are difficult and adequate training and supervision are therefore essential. Clear decisions need to be made on the nature and extent of data collected during any one trip. Often, the amount of data and frequency of collection can be established analytically with preliminary data.
Preferably, observers should only collect data, not carry out other activities, such as enforcement, licensing or tax collection. This should help to minimise bias by reducing the incentives to lie. Problems in terms of conflicts between data collection and law enforcement, for example, can be reduced by clear demarcation, separating activities by location or time.
This becomes a necessity for at-sea observers. Their positions on fishing vessels and the tasks that they perform depend significantly on a good working relationship with the captain and crew, which can be lost if they are perceived as enforcement personnel. Data of this kind can sometimes be subjected to careful scrutiny, summary, and inquiry by historians and social scientists, and statistical methods have increasingly been used to develop and evaluate inferences drawn from such data.
Some of the main comparative approaches are described below. Among the more striking problems facing the scientist using such data are the vast differences in what has been recorded by different agencies whose behavior is being compared (this is especially true for parallel agencies in different nations), the highly unrepresentative or idiosyncratic sampling that can occur in the collection of such data, and the selective preservation and destruction of records.
Means to overcome these problems form a substantial methodological research agenda in comparative research. An example of the method of cross-national aggregative comparisons is found in investigations by political scientists and sociologists of the factors that underlie differences in the vitality of institutions of political democracy in different societies.
Some investigators have stressed the existence of a large middle class, others the level of education of a population, and still others the development of systems of mass communication. In cross-national aggregate comparisons, a large number of nations are arrayed according to some measures of political democracy and then attempts are made to ascertain the strength of correlations between these and the other variables. In this line of analysis it is possible to use a variety of statistical cluster and regression techniques to isolate and assess the possible impact of certain variables on the institutions under study.
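The correlational core of such cross-national aggregate comparisons can be sketched very simply: array nations on a democracy measure and a hypothesized explanatory variable, then compute the strength of association. The country scores below are invented purely to show the computation.

```python
# Invented-data sketch of a cross-national aggregate comparison: the
# correlation between a democracy index and an education index across
# a handful of hypothetical nations.

import math

democracy = [8.1, 6.5, 3.2, 9.0, 4.4, 7.2]        # democracy index per nation
education = [0.91, 0.74, 0.40, 0.95, 0.55, 0.82]  # schooling index per nation

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

print(round(pearson(democracy, education), 2))
```

A high correlation here would, of course, establish only association, which is why the text stresses cluster and regression techniques to isolate individual variables.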
While this kind of research is cross-sectional in character, statements about historical processes are often invoked to explain the correlations. Why did democracy develop in such different ways in America, France, and England? Why did northeastern Europe develop rational bourgeois capitalism, in contrast to the Mediterranean and Asian nations? Modern scholars have turned their attention to explaining, for example, differences among types of fascism between the two World Wars, and similarities and differences among modern state welfare systems, using these comparisons to unravel the salient causes.
The questions asked in these instances are inevitably historical ones. Historical case studies involve only one nation or region, and so they may not be geographically comparative.
However, insofar as they involve tracing the transformation of a society's major institutions and the role of its main shaping events, they involve a comparison of different periods of a nation's or a region's history. The goal of such comparisons is to give a systematic account of the relevant differences. Sometimes, particularly with respect to the ancient societies, the historical record is very sparse, and the methods of history and archaeology mesh in the reconstruction of complex social arrangements and patterns of change on the basis of few fragments.
Like all research designs, comparative ones have distinctive vulnerabilities and advantages: One of the main advantages of using comparative designs is that they greatly expand the range of data, as well as the amount of variation in those data, for study. Consequently, they allow for more encompassing explanations and theories that can relate highly divergent outcomes to one another in the same framework.
They also contribute to reducing any cultural biases or tendencies toward parochialism among scientists studying common human phenomena. One main vulnerability in such designs arises from the problem of achieving comparability.
For example, a vote in a Western democracy is different from a vote in an Eastern bloc country, and a voluntary vote in the United States means something different from a compulsory vote in Australia. These circumstances make for interpretive difficulties in comparing aggregate rates of voter turnout in different countries. The problem of achieving comparability appears in historical analysis as well. For example, changes in laws and enforcement and recording procedures over time change the definition of what is and what is not a crime, and for that reason it is difficult to compare the crime rates over time.
Comparative researchers struggle with this problem continually, working to fashion equivalent measures; some have suggested the use of different measures (voting, letters to the editor, street demonstrations) in different societies for common variables. A second vulnerability is controlling variation. Traditional experiments make conscious and elaborate efforts to control the variation of some factors and thereby assess the causal significance of others.
In surveys as well as experiments, statistical methods are used to control sources of variation and assess suspected causal significance. In comparative and historical designs, this kind of control is often difficult to attain because the sources of variation are many and the number of cases few.
Scientists have made efforts to approximate such control in these cases of "many variables, small N." Another method is to select, for comparative purposes, a sample of societies that resemble one another in certain critical ways, such as size, common language, and common level of development, thus attempting to hold these factors roughly constant, and then seeking explanations among other factors in which the sampled societies differ from one another.

Ethnographic Designs

Traditionally identified with anthropology, ethnographic research designs are playing increasingly significant roles in most of the behavioral and social sciences.
The core of this methodology is participant-observation, in which a researcher spends an extended period of time with the group under study, ideally mastering the local language, dialect, or special vocabulary, and participating in as many activities of the group as possible. This kind of participant-observation is normally coupled with extensive open-ended interviewing, in which people are asked to explain in depth the rules, norms, practices, and beliefs through which (from their point of view) they conduct their lives.
A principal aim of ethnographic study is to discover the premises on which those rules, norms, practices, and beliefs are built. The use of ethnographic designs by anthropologists has contributed significantly to the building of knowledge about social and cultural variation. One major trend concerns its scale. Ethnographic methods were originally developed largely for studying small-scale groupings known variously as village, folk, primitive, preliterate, or simple societies.
Over the decades, these methods have increasingly been applied to the study of groupings within modern society. The typical subjects of ethnographic study in modern society are small groups or relatively small social networks, such as outpatient clinics, medical schools, religious cults and churches, ethnically distinctive urban neighborhoods, corporate offices and factories, and government bureaus and legislatures. As anthropologists moved into the study of modern societies, researchers in other disciplines (particularly sociology, psychology, and political science) began using ethnographic methods to enrich and focus their own insights and findings.
At the same time, studies of large-scale structures and processes have been aided by the use of ethnographic methods, since most large-scale changes work their way into the fabric of community, neighborhood, and family, affecting the daily lives of people. Ethnographers have studied, for example, the impact of new industry and new forms of labor in "backward" regions; the impact of state-level birth control policies on ethnic groups; and the impact on residents in a region of building a dam or establishing a nuclear waste dump.
Advances in structured interviewing (see above) have proven especially powerful in the study of culture. Techniques for understanding kinship systems, concepts of disease, color terminologies, ethnobotany, and ethnozoology have been radically transformed and strengthened by coupling new interviewing methods with modern measurement and scaling techniques (see below). These techniques have made possible more precise comparisons among cultures and identification of the most competent and expert persons within a culture.
The next step is to extend these methods to study the ways in which networks of propositions (such as "boys like sports," "girls like babies") are organized to form belief systems. Much evidence suggests that people typically represent the world around them by means of relatively complex cognitive models that involve interlocking propositions.
The techniques of scaling have been used to develop models of how people categorize objects, and they have great potential for further development, to analyze data pertaining to cultural propositions.

Ideological Systems

Perhaps the most fruitful area for the application of ethnographic methods in recent years has been the systematic study of ideologies in modern society.
Earlier studies of ideology were in small-scale societies that were rather homogeneous. In these studies researchers could report on a single culture, a uniform system of beliefs and values for the society as a whole. Modern societies are much more diverse, both in origins and in the number of subcultures they contain. Yet these subcultures and ideologies share certain underlying assumptions or at least must find some accommodation with the dominant value and belief systems in the society.
The challenge is to incorporate this greater complexity of structure and process into systematic descriptions and interpretations. One line of work carried out by researchers has tried to track the ways in which ideologies are created, transmitted, and shared among large populations that have traditionally lacked the social mobility and communications technologies of the West.
This work has concentrated on large-scale civilizations such as China, India, and Central America. How are the ideological doctrines and cultural values of the urban elites, the great traditions, transmitted to local communities? How are the little traditions, the ideas from the more isolated, less literate, and politically weaker groups in society, transmitted to the elites? India and southern Asia have been fruitful areas for ethnographic research on these questions.
The great Hindu tradition was present in virtually all local contexts through the presence of high-caste individuals in every community. It operated as a pervasive standard of value for all members of society, even in the face of strong little traditions. The situation is surprisingly akin to that of modern, industrialized societies.
The central research questions are the degree and the nature of penetration of dominant ideology, even in groups that appear marginal and subordinate and have no strong interest in sharing the dominant value system.

Historical Reconstruction

Another current trend in ethnographic methods is its convergence with archival methods. One joining point is the application of descriptive and interpretative procedures used by ethnographers to reconstruct the cultures that created historical documents, diaries, and other records, to interview history, so to speak.
For example, a revealing study showed how the Inquisition in the Italian countryside gradually worked subtle changes in an ancient fertility cult in peasant communities; the peasant beliefs and rituals assimilated many elements of witchcraft after learning them from their persecutors.
A good deal of social history (particularly that of the family) has drawn on discoveries made in the ethnographic study of primitive societies. As described in Chapter 4, this particular line of inquiry rests on a marriage of ethnographic, archival, and demographic approaches. A strikingly successful example of this kind of effort is a study of head-hunting. By combining an interpretation of local oral tradition with the fragmentary observations made by outside observers (such as missionaries, traders, and colonial officials), historical fluctuations in the rate and significance of head-hunting were shown to be partly a response to such international forces as the Great Depression and World War II.
Researchers are also investigating the ways in which various groups in contemporary societies invent versions of traditions that may or may not reflect the actual history of the group. This process has been observed among elites seeking political and cultural legitimation and among hard-pressed minorities (for example, the Basque in Spain, the Welsh in Great Britain) seeking roots and political mobilization.
Ethnography is a powerful method to record, describe, and interpret the system of meanings held by groups and to discover how those meanings affect the lives of group members. It is a method well adapted to the study of situations in which people interact with one another and the researcher can interact with them as well, so that information about meanings can be evoked and observed.
By the same token, experimental, survey, and comparative methods frequently yield connections, the meaning of which is unknown; ethnographic methods are a valuable way to determine them.
Scientists continuously try to describe possible structures and ask whether the data can, with allowance for errors of measurement, be described adequately in terms of them. Over a long time, various families of structures have recurred throughout many fields of science; these structures have become objects of study in their own right, principally by statisticians, other methodological specialists, applied mathematicians, and philosophers of logic and science.
Methods have evolved to evaluate the adequacy of particular structures to account for particular types of data. In the interest of clarity we discuss these structures in this section and the analytical methods used for estimation and evaluation of them in the next section, although in practice they are closely intertwined.
A good deal of mathematical and statistical modeling attempts to describe the relations, both structural and dynamic, that hold among variables that are presumed to be representable by numbers. Such models are applicable in the behavioral and social sciences only to the extent that appropriate numerical measurement is possible.
In many studies the phenomena in question and the raw data obtained are not intrinsically numerical, but qualitative, such as ethnic group identifications.
The identifying numbers used to code such questionnaire categories for computers are no more than labels, which could just as well be letters or colors. One key question is whether there is some natural way to move from the qualitative aspects of such data to a structural representation that involves one of the well-understood numerical or geometric models or whether such an attempt would be inherently inappropriate for the data in question. The decision as to whether or not particular empirical data can be represented in particular numerical or more complex structures is seldom simple, and strong intuitive biases or a priori assumptions about what can and cannot be done may be misleading.
Recent decades have seen rapid and extensive development and application of analytical methods attuned to the nature and complexity of social science data.
Examples of nonnumerical modeling are increasing. Moreover, the widespread availability of powerful computers is probably leading to a qualitative revolution: it affects not only the ability to compute numerical solutions to numerical models, but also the ability to work out the consequences of all sorts of structures that do not involve numbers at all. The following discussion gives some indication of the richness of past progress and of future prospects, although it is by necessity far from exhaustive.
In describing some of the areas of new and continuing research, we have organized this section on the basis of whether the representations are fundamentally probabilistic or not. A further useful distinction is between representations of data that are highly discrete or categorical in nature (such as whether a person is male or female) and those that are continuous in nature (such as a person's height).
Of course, there are intermediate cases involving both types of variables, such as color stimuli that are characterized by discrete hues (red, green) and a continuous luminance measure. Probabilistic models lead very naturally to questions of estimation and statistical evaluation of the correspondence between data and model. Those that are not probabilistic involve additional problems of dealing with and representing sources of variability that are not explicitly modeled.
At the present time, scientists understand some aspects of structure, such as geometries, and some aspects of randomness, as embodied in probability models, but do not yet adequately understand how to put the two together in a single unified model. A table outlines the way we have organized this discussion and shows where the examples in this section lie.

Probability Models

Some behavioral and social sciences variables appear to be more or less continuous, for example, utility of goods, loudness of sounds, or risk associated with uncertain alternatives.
Many other variables, however, are inherently categorical. And some variables, such as moral attitudes, are typically measured in research with survey questions that allow only categorical responses. Much of the early probability theory was formulated only for continuous variables; its use with categorical variables was not really justified, and in some cases it may have been misleading.
Recently, very significant advances have been made in how to deal explicitly with categorical variables. This section first describes several contemporary approaches to models involving categorical variables, followed by ones involving continuous representations.
Log-Linear Models for Categorical Variables

Many recent models for analyzing categorical data of the kind usually displayed as counts (cell frequencies) in multidimensional contingency tables are subsumed under the general heading of log-linear models, that is, linear models in the natural logarithms of the expected counts in each cell in the table.
These recently developed forms of statistical analysis allow one to partition variability due to various sources in the distribution of categorical attributes, and to isolate the effects of particular variables or combinations of them.
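As a concrete illustration, the simplest log-linear model, independence between two categorical attributes, can be fitted to a contingency table by iterative proportional fitting. The counts below are hypothetical, and this sketch covers only the two-way, no-interaction case:

```python
import math

# Observed 2x2 contingency table (hypothetical counts): rows = attribute A,
# columns = attribute B.
observed = [[30.0, 10.0],
            [20.0, 40.0]]

def fit_independence_ipf(table, iters=100):
    """Fit the no-interaction (independence) log-linear model by iterative
    proportional fitting: alternately rescale rows and columns so the fitted
    table reproduces the observed margins."""
    n_rows, n_cols = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(n_rows)) for j in range(n_cols)]
    fitted = [[1.0] * n_cols for _ in range(n_rows)]
    for _ in range(iters):
        for i in range(n_rows):                      # match row margins
            s = sum(fitted[i])
            for j in range(n_cols):
                fitted[i][j] *= row_tot[i] / s
        for j in range(n_cols):                      # match column margins
            s = sum(fitted[i][j] for i in range(n_rows))
            for i in range(n_rows):
                fitted[i][j] *= col_tot[j] / s
    return fitted

fitted = fit_independence_ipf(observed)
# Likelihood-ratio statistic G^2 measures the departure from independence.
g2 = 2 * sum(o * math.log(o / e)
             for row_o, row_e in zip(observed, fitted)
             for o, e in zip(row_o, row_e) if o > 0)
```

A large G² relative to a chi-squared reference indicates that the two attributes are associated, which is exactly the kind of partitioning of variability described above.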
Present log-linear models were first developed and used by statisticians and sociologists and then found extensive application in other social and behavioral sciences disciplines. When applied, for instance, to the analysis of social mobility, such models separate factors of occupational supply and demand from other factors that impede or propel movement up and down the social hierarchy. With such models, for example, researchers discovered the surprising fact that occupational mobility patterns are strikingly similar in many nations of the world (even among disparate nations like the United States and most of the Eastern European socialist countries), and from one time period to another, once allowance is made for differences in the distributions of occupations.
As another example of applications, psychologists and others have used log-linear models to analyze attitudes and their determinants and to link attitudes to behavior. These methods have also diffused to and been used extensively in the medical and biological sciences.
Regression Models for Categorical Variables

Models that permit one variable to be explained or predicted by means of others, called regression models, are the workhorses of much applied statistics; this is especially true when the dependent (explained) variable is continuous. For a two-valued dependent variable, such as alive or dead, models and approximate theory and computational methods for one explanatory variable were developed in biometry about 50 years ago.
Computer programs able to handle many explanatory variables, continuous or categorical, are readily available today. Even now, however, the accuracy of the approximate theory for a given data set remains an open question. Using classical utility theory, economists have developed discrete choice models that turn out to be somewhat related to the log-linear and categorical regression models.
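A minimal sketch of the two-valued regression case is the logistic model, fitted here by gradient ascent on the log-likelihood. The data are hypothetical and chosen only for illustration (they happen to be perfectly separable, so with unlimited training the coefficients would grow without bound; a fixed number of steps gives a usable fit):

```python
import math

# Hypothetical data: x = an explanatory variable, y = two-valued outcome.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0,   0,   0,   0,   1,   1,   1,   1]

def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit P(y=1 | x) = 1 / (1 + exp(-(a + b*x))) by gradient ascent
    on the log-likelihood of the observed 0/1 outcomes."""
    a, b = 0.0, 0.0
    for _ in range(steps):
        ga = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            ga += y - p          # gradient with respect to the intercept
            gb += (y - p) * x    # gradient with respect to the slope
        a += lr * ga
        b += lr * gb
    return a, b

a, b = fit_logistic(xs, ys)

def predict(x):
    """Fitted probability that y = 1 at the value x."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))
```

With several explanatory variables the gradient simply gains one component per coefficient, which is what the packaged programs mentioned above automate.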
Models for limited dependent variables, especially those that cannot take on values above or below a certain level (such as weeks unemployed, number of children, and years of schooling), have been used profitably in economics and in some other areas. For example, censored normal variables (called tobits in economics), in which observed values outside certain limits are simply counted, have been used in studying decisions to go on in school.
It will require further research and development to incorporate information about limited ranges of variables fully into the main multivariate methodologies. In addition, with respect to the assumptions about distribution and functional form conventionally made in discrete response models, some new methods are now being developed that show promise of yielding reliable inferences without making unrealistic assumptions; further research in this area is needed. One problem arises from the fact that many of the categorical variables collected by the major data bases are ordered.
For example, attitude surveys frequently use a 3-, 5-, or 7-point scale from high to low without specifying numerical intervals between levels. Social class and educational levels are often described by ordered categories. Ignoring order information, which many traditional statistical methods do, may be inefficient or inappropriate, but replacing the categories by successive integers or other arbitrary scores may distort the results.
For additional approaches to this question, see sections below on ordered structures. Regression-like analysis of ordinal categorical variables is quite well developed, but their multivariate analysis needs further research. New log-bilinear models have been proposed, but to date they deal only with special cases. Additional research extending the new models, improving computational algorithms, and integrating the models with work on scaling promises to lead to valuable new knowledge.
Models for Event Histories

Event-history studies yield the sequence of events that respondents to a survey sample experience over a period of time; for example, the timing of marriage, childbearing, or labor force participation. Event-history data can be used to study educational progress, demographic processes (migration, fertility, and mortality), mergers of firms, and labor market behavior.
As interest in such data has grown, many researchers have turned to models that pertain to changes in probabilities over time to describe when and how individuals move among a set of qualitative states. Much of the progress in models for event-history data builds on recent developments in statistics and biostatistics for life-time, failure-time, and hazard models. Such models permit the analysis of qualitative transitions in a population whose members are undergoing partially random organic deterioration, mechanical wear, or other risks over time.
With the increased complexity of event-history data that are now being collected, and the extension of event-history data bases over very long periods of time, new problems arise that cannot be effectively handled by older types of analysis. Among the problems are repeated transitions, such as between unemployment and employment or marriage and divorce; more than one time variable (such as biological age, calendar time, duration in a stage, and time exposed to some specified condition); latent variables (variables that are explicitly modeled even though not observed); gaps in the data; sample attrition that is not randomly distributed over the categories; and respondent difficulties in recalling the exact timing of events.
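The failure-time machinery mentioned above rests on estimating a survivor function from durations, some of which are right-censored (the person was still in the state when observation ended). A minimal sketch, with hypothetical durations, is the product-limit (Kaplan-Meier) estimator:

```python
# Hypothetical event-history records: durations (e.g. months until leaving
# unemployment) with a flag for censoring (1 = transition observed,
# 0 = still in the state when observation ended).
durations = [2, 3, 3, 5, 6, 8, 8, 12]
observed  = [1, 1, 0, 1, 1, 1, 0, 0]

def kaplan_meier(durations, observed):
    """Product-limit (Kaplan-Meier) estimate of the survivor function:
    at each observed event time, multiply by the fraction of those still
    at risk who did not make the transition."""
    times = sorted({t for t, d in zip(durations, observed) if d == 1})
    surv, curve = 1.0, []
    for t in times:
        at_risk = sum(1 for u in durations if u >= t)
        events = sum(1 for u, d in zip(durations, observed)
                     if u == t and d == 1)
        surv *= 1.0 - events / at_risk
        curve.append((t, surv))
    return curve

curve = kaplan_meier(durations, observed)
```

Censored cases contribute to the at-risk counts up to their censoring time but never count as events, which is how the estimator uses the partial information they carry.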
Models for Multiple-Item Measurement

For a variety of reasons, researchers typically use multiple measures, or multiple indicators, to represent theoretical concepts. Sociologists, for example, often rely on two or more variables (such as occupation and education) to measure an individual's socioeconomic position; educational psychologists ordinarily measure a student's ability with multiple test items.
Despite the fact that the basic observations are categorical, in a number of applications this is interpreted as a partitioning of something continuous. For example, in test theory one thinks of the measures of both item difficulty and respondent ability as continuous variables, possibly multidimensional in character. Classical test theory and newer item-response theories in psychometrics deal with the extraction of information from multiple measures.
Testing, which is a major source of data in education and other areas, results in millions of test scores each year. One goal of research on such test data is to be able to make comparisons among persons or groups even when different test items are used. Although the information collected from each respondent is intentionally incomplete in order to keep the tests short and simple, item-response techniques permit researchers to reconstitute the fragments into an accurate picture of overall group proficiencies.
These new methods provide a better theoretical handle on individual differences, and they are expected to be extremely important in developing and using tests.
For example, they have been used in attempts to equate different forms of a test given in successive waves during a year, a procedure made necessary in large-scale testing programs by legislation requiring disclosure of test-scoring keys at the time results are given.
The goal of this project, the National Assessment of Educational Progress (NAEP), is to provide accurate, nationally representative information on the average rather than individual proficiency of American children in a wide variety of academic subjects as they progress through elementary and secondary school.
This approach is an improvement over the use of trend data on university entrance exams, because NAEP estimates of academic achievements (by broad characteristics such as age, grade, region, ethnic background, and so on) are not distorted by the self-selected character of those students who seek admission to college, graduate, and professional programs.
Item-response theory also forms the basis of many new psychometric instruments, known as computerized adaptive testing, currently being implemented by the U. In adaptive tests, a computer program selects items for each examinee based upon the examinee's success with previous items. Generally, each person gets a slightly different set of items and the equivalence of scale scores is established by using item-response theory. Adaptive testing can greatly reduce the number of items needed to achieve a given level of measurement accuracy.
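The logic of adaptive testing can be sketched with the one-parameter (Rasch) item-response model: estimate ability from the responses so far, then administer the unused item closest in difficulty to that estimate, where the item is most informative. The item bank, responses, and update rule below are illustrative assumptions, not any operational testing system:

```python
import math

def p_correct(theta, b):
    """Rasch (one-parameter logistic) model: probability that a person of
    ability theta answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def estimate_ability(responses, theta=0.0, steps=200, lr=0.5):
    """Gradient ascent on the log-likelihood, giving a maximum-likelihood
    ability estimate from (difficulty, answered-correctly) pairs."""
    for _ in range(steps):
        theta += lr * sum(y - p_correct(theta, b) for b, y in responses)
    return theta

def pick_next_item(theta, item_bank, used):
    """Adaptive rule: administer the unused item whose difficulty is
    closest to the current ability estimate."""
    return min((b for b in item_bank if b not in used),
               key=lambda b: abs(b - theta))

# Hypothetical item bank (difficulties) and a partial response record:
# correct on the -1 and 0 items, incorrect on the +1 item.
bank = [-2.0, -1.0, 0.0, 1.0, 2.0]
responses = [(-1.0, 1), (0.0, 1), (1.0, 0)]
theta = estimate_ability(responses)
next_item = pick_next_item(theta, bank, used={-1.0, 0.0, 1.0})
```

Because ability and item difficulty sit on the same scale, scores from different item sets can be placed on a common metric, which is what makes the equating described above possible.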
Nonlinear, Nonadditive Models

Virtually all statistical models now in use impose a linearity or additivity assumption of some kind, sometimes after a nonlinear transformation of variables.
Imposing these forms on relationships that do not, in fact, possess them may well result in false descriptions and spurious effects. Unwary users, especially of computer software packages, can easily be misled. But more realistic nonlinear and nonadditive multivariate models are becoming available. Extensive use with empirical data is likely to force many changes and enhancements in such models and stimulate quite different approaches to nonlinear multivariate analysis in the next decade.
Geometric and Algebraic Models

In some cases geometric and algebraic models are part of a probabilistic approach, such as the algebraic models underlying regression or the geometric representations of correlations between items in a technique called factor analysis. In other cases, geometric and algebraic models are developed without explicitly modeling the element of randomness or uncertainty that is always present in the data.
Although this latter approach to behavioral and social sciences problems has been less researched than the probabilistic one, there are some advantages in developing the structural aspects independent of the statistical ones. We begin the discussion with some inherently geometric representations and then turn to numerical representations for ordered data. Although geometry is a huge mathematical topic, little of it seems directly applicable to the kinds of data encountered in the behavioral and social sciences.
Nevertheless, since geometric representations are used to reduce bodies of data, there is a real need to develop a deeper understanding of when such representations of social or psychological data make sense. Moreover, there is a practical need to understand why geometric computer algorithms, such as those of multidimensional scaling, work as well as they apparently do. A better understanding of the algorithms will increase the efficiency and appropriateness of their use, which becomes increasingly important with the widespread availability of scaling programs for microcomputers.
Scaling

Over the past 50 years several kinds of well-understood scaling techniques have been developed and widely used to assist in the search for appropriate geometric representations of empirical data. The whole field of scaling is now entering a critical juncture in terms of unifying and synthesizing what earlier appeared to be disparate contributions. Within the past few years it has become apparent that several major methods of analysis, including some that are based on probabilistic assumptions, can be unified under the rubric of a single generalized mathematical structure.
For example, it has recently been demonstrated that such diverse approaches as nonmetric multidimensional scaling, principal-components analysis, factor analysis, correspondence analysis, and log-linear analysis have more in common in terms of underlying mathematical structure than had earlier been realized. Nonmetric multidimensional scaling is a method that begins with data about the ordering established by subjective similarity (or nearness) between pairs of stimuli.
The idea is to embed the stimuli into a metric space (that is, a geometry). This method has been successfully applied to phenomena that, on other grounds, are known to be describable in terms of a specific geometric structure; such applications were used to validate the procedures.
Such validation was done, for example, with respect to the perception of colors, which are known to be describable in terms of a particular three-dimensional structure known as the Euclidean color coordinates. Similar applications have been made with Morse code symbols and spoken phonemes. The technique is now used in some biological and engineering applications, as well as in some of the social sciences, as a method of data exploration and simplification.
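The embedding step can be illustrated with the classical metric variant of multidimensional scaling, which recovers coordinates directly from distances by an eigendecomposition (nonmetric scaling replaces the distances by their rank order and iterates, which is beyond this sketch). The four stimulus positions below are hypothetical:

```python
import numpy as np

def classical_mds(dist, dim=2):
    """Classical (metric) multidimensional scaling: recover coordinates in
    a low-dimensional Euclidean space from a matrix of pairwise distances
    by double-centering the squared distances and taking the leading
    eigenvectors."""
    n = dist.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    b = -0.5 * j @ (dist ** 2) @ j             # double-centered squared distances
    vals, vecs = np.linalg.eigh(b)
    top = np.argsort(vals)[::-1][:dim]         # keep the largest eigenvalues
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))

# Hypothetical stimuli at known planar positions; the distances alone
# suffice to recover the configuration (up to rotation and reflection).
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
coords = classical_mds(dist)
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
```

Because exact Euclidean distances were supplied, the recovered configuration reproduces them; with real similarity judgments the fit is only approximate, which is where the validation question above arises.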
One question of interest is how to develop an axiomatic basis for various geometries using as a primitive concept an observable such as the subject's ordering of the relative similarity of one pair of stimuli to another, which is the typical starting point of such scaling.
The general task is to discover properties of the qualitative data sufficient to ensure that a mapping into the geometric structure exists and, ideally, to discover an algorithm for finding it. Some work of this general type has been carried out, but the more general problem of understanding the conditions under which the multidimensional scaling algorithms are suitable remains unsolved.
In addition, work is needed on understanding more general, non-Euclidean spatial models.

Ordered Factorial Systems

One type of structure common throughout the sciences arises when an ordered dependent variable is affected by two or more ordered independent variables.
This is the situation to which regression and analysis-of-variance models are often applied; it is also the structure underlying the familiar physical identities, in which physical units are expressed as products of the powers of other units (for example, energy has the unit of mass times the square of the unit of distance divided by the square of the unit of time).
There are many examples of these types of structures in the behavioral and social sciences. One example is the ordering of preference of commodity bundles (collections of various amounts of commodities), which may be revealed directly by expressions of preference or indirectly by choices among alternative sets of bundles.
A related example is preferences among alternative courses of action that involve various outcomes with differing degrees of uncertainty; this is one of the more thoroughly investigated problems because of its potential importance in decision making. A psychological example is the trade-off between delay and amount of reward, yielding those combinations that are equally reinforcing.
In a common, applied kind of problem, a subject is given descriptions of people in terms of several factors, for example, intelligence and creativity, and is asked to make overall judgments about them. In all these cases, and a myriad of others like them, the question is whether the regularities of the data permit a numerical representation. Initially, three types of representations were studied quite fully: adding, multiplying, and averaging. The first two representations underlie some psychological and economic investigations, as well as a considerable portion of physical measurement and modeling in classical statistics.
The third representation, averaging, has proved most useful in understanding preferences among uncertain outcomes and the amalgamation of verbally described traits, as well as some physical variables. For each of these three cases (adding, multiplying, and averaging), researchers know what properties or axioms of order the data must satisfy for such a numerical representation to be appropriate.
On the assumption that one or another of these representations exists, and using numerical ratings by subjects instead of ordering, a scaling technique called functional measurement (referring to the function that describes how the dependent variable relates to the independent ones) has been developed and applied in a number of domains. What remains problematic is how to encompass at the ordinal level the fact that some random error intrudes into nearly all observations and then to show how that randomness is represented at the numerical level; this continues to be an unresolved and challenging research issue.
During the past few years considerable progress has been made in understanding certain representations inherently different from those just discussed. The work has involved three related thrusts. The first is a scheme of classifying structures according to how uniquely their representation is constrained.
The three classical numerical representations are known as ordinal, interval, and ratio scale types. For systems with continuous numerical representations and of scale type at least as rich as the ratio one, it has been shown that only one additional type can exist.
A second thrust is to accept structural assumptions, like factorial ones, and to derive for each scale the possible functional relations among the independent variables. And the third thrust is to develop axioms for the properties of an order relation that leads to the possible representations. Much is now known about the possible nonadditive representations of both the multifactor case and the one where stimuli can be combined.
Closely related to this classification of structures is the question: What statements, formulated in terms of the measures arising in such representations, can be viewed as meaningful in the sense of corresponding to something empirical? Statements here refer to any scientific assertions, including statistical ones, formulated in terms of the measures of the variables and logical and mathematical connectives.
In particular, statements that remain invariant under certain symmetries of structure have played an important role in classical geometry, dimensional analysis in physics, and in relating measurement and statistical models applied to the same phenomenon. In addition, these ideas have been used to construct models in more formally developed areas of the behavioral and social sciences, such as psychophysics.
Current research has emphasized the communality of these historically independent developments and is attempting both to uncover systematic, philosophically sound arguments as to why invariance under symmetries is as important as it appears to be and to understand what to do when structures lack symmetry, as, for example, when variables have an inherent upper bound.
Clustering

Many subjects do not seem to be correctly represented in terms of distances in continuous geometric space. Rather, in some cases, such as the relations among meanings of words (which is of great interest in the study of memory representations), a description in terms of tree-like, hierarchical structures appears to be more illuminating.
This kind of description appears appropriate both because of the categorical nature of the judgments and the hierarchical, rather than trade-off, nature of the structure.
Individual items are represented as the terminal nodes of the tree, and groupings by different degrees of similarity are shown as intermediate nodes, with the more general groupings occurring nearer the root of the tree. Clustering techniques, requiring considerable computational power, have been and are being developed. Some successful applications exist, but much more refinement is anticipated.
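A minimal sketch of such tree-building is agglomerative clustering with single linkage: start with each item as its own cluster and repeatedly merge the two clusters whose closest members are nearest. The dissimilarity matrix below is hypothetical:

```python
def single_linkage(dist):
    """Agglomerative clustering, single linkage: repeatedly merge the two
    clusters whose closest members are nearest, recording each merge and
    the dissimilarity at which it occurred (the merge tree)."""
    clusters = [frozenset([i]) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        merged = clusters[a] | clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges

# Hypothetical dissimilarities among five items (e.g. judged word meanings).
dist = [[0, 1, 5, 6, 7],
        [1, 0, 5, 6, 7],
        [5, 5, 0, 2, 6],
        [6, 6, 2, 0, 6],
        [7, 7, 6, 6, 0]]
merges = single_linkage(dist)
```

Reading the merge record from first to last gives the tree: tight pairs appear as low intermediate nodes and the loosest grouping sits at the root, exactly the terminal-node/root picture described above.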
Network Models

Several other lines of advanced modeling have progressed in recent years, opening new possibilities for empirical specification and testing of a variety of theories. In social network data, relationships among units, rather than the units themselves, are the primary objects of study. Special models for social network data have been developed in the past decade, and they give, among other things, precise new measures of the strengths of relational ties among units.
A major challenge in social network data at present is to handle the statistical dependence that arises when the units sampled are related in complex ways. Some issues of inference and analysis have been discussed above.
This section discusses some more general issues of statistical inference and advances in several current approaches to them.

Causal Inference

Behavioral and social scientists use statistical methods primarily to infer the effects of treatments, interventions, or policy factors.
Previous chapters included many instances of causal knowledge gained this way. As noted above, the large experimental study of alternative health care financing discussed in Chapter 2 relied heavily on statistical principles and techniques, including randomization, in the design of the experiment and the analysis of the resulting data.
Sophisticated designs were necessary in order to answer a variety of questions in a single large study without confusing the effects of one program difference (such as prepayment or fee for service) with the effects of another (such as different levels of deductible costs), or with effects of unobserved variables (such as genetic differences).
Statistical techniques were also used to ascertain which results applied across the whole enrolled population and which were confined to certain subgroups (such as individuals with high blood pressure) and to translate utilization rates across different programs and types of patients into comparable overall dollar costs and health outcomes for alternative financing options.
A classical experiment, with systematic but randomly assigned variation of the variables of interest (or some reasonable approach to this), is usually considered the most rigorous basis from which to draw such inferences.
But random samples or randomized experimental manipulations are not always feasible or ethically acceptable. Then, causal inferences must be drawn from observational studies, which, however well designed, are less able to ensure that the observed or inferred relationships among variables provide clear evidence on the underlying mechanisms of cause and effect.
Certain recurrent challenges have been identified in studying causal inference. One challenge arises from the selection of background variables to be measured, such as the sex, nativity, or parental religion of individuals in a comparative study of how education affects occupational success. The adequacy of classical methods of matching groups in background variables and adjusting for covariates needs further investigation.
Statistical adjustment of biases linked to measured background variables is possible, but it can become complicated. Current work in adjustment for selectivity bias is aimed at weakening implausible assumptions, such as normality, when carrying out these adjustments. Even after adjustment has been made for the measured background variables, other, unmeasured variables are almost always still affecting the results (such as family transfers of wealth or reading habits).
Analyses of how the conclusions might change if such unmeasured variables could be taken into account are therefore needed. A third important issue arises from the necessity for distinguishing among competing hypotheses when the explanatory variables are measured with different degrees of precision.
Both the estimated size and significance of an effect are diminished when it has large measurement error, and the coefficients of other correlated variables are affected even when the other variables are measured perfectly. Similar results arise from conceptual errors, when one measures only proxies for a theoretical construct (such as years of education to represent amount of learning).
In some cases, there are procedures for simultaneously or iteratively estimating both the precision of complex measures and their effect on a particular criterion. Although complex models are often necessary to infer causes, once their output is available, it should be translated into understandable displays for evaluation. Results that depend on the accuracy of a multivariate model and the associated software need to be subjected to appropriate checks, including the evaluation of graphical displays, group comparisons, and other analyses.
New Statistical Techniques

Internal Resampling

One of the great contributions of twentieth-century statistics was to demonstrate how a properly drawn sample of sufficient size, even if it is only a tiny fraction of the population of interest, can yield very good estimates of most population characteristics. When enough is known at the outset about the characteristic in question (for example, that its distribution is roughly normal), inference from the sample data to the population as a whole is straightforward, and one can easily compute measures of the certainty of inference, a common example being the 95 percent confidence interval around an estimate.
But population shapes are sometimes unknown or uncertain, and so inference procedures cannot be so simple. Furthermore, more often than not, it is difficult to assess even the degree of uncertainty associated with complex data and with the statistics needed to unravel complex social and behavioral phenomena. Internal resampling methods attempt to assess this uncertainty by generating a number of simulated data sets similar to the one actually observed.
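The best-known internal resampling method is the bootstrap: resample the observed data with replacement, recompute the statistic each time, and read an interval off the simulated distribution, with no normality assumption required. A minimal percentile-bootstrap sketch, with hypothetical data and the median as the statistic of interest:

```python
import random

def bootstrap_ci(sample, stat, n_boot=2000, alpha=0.05, seed=7):
    """Percentile bootstrap: draw n_boot resamples of the data (with
    replacement), compute the statistic on each, and take the empirical
    alpha/2 and 1 - alpha/2 quantiles as the interval."""
    rng = random.Random(seed)
    sims = sorted(stat([rng.choice(sample) for _ in sample])
                  for _ in range(n_boot))
    lo = sims[int((alpha / 2) * n_boot)]
    hi = sims[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical skewed sample, for which the sampling distribution of the
# median is not well approximated by a normal curve.
data = [1, 1, 2, 2, 3, 3, 4, 5, 8, 13, 21]

def median(xs):
    return sorted(xs)[len(xs) // 2]

lo, hi = bootstrap_ci(data, median)
```

Each simulated data set here is one resample of the observed data, which is exactly the sense in which the method generates data sets "similar to the one actually observed."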