
Summary
Recently, with the availability of low-cost, powerful computers and given the strict requirements and inflexibility of classical parametric methods, much attention has turned to alternative procedures. In fisheries science, for example, size frequency is analyzed using histograms and frequency polygons. However, these density estimators have several drawbacks, including dependence on the grid origin, discontinuity, and the use of fixed-width intervals. These problems have motivated interest in alternative methods that are more efficient, although computationally intensive. In this study several nonparametric methods are briefly introduced and applied to fisheries biology data.
A general-purpose nonlinear resistant smoother (based on repeated running medians) proved to be an efficient method for reducing the noise of minimally grouped size data. However, several drawbacks seriously limit its use in practice: dependence on the original grouping scheme, the need for an adequate number of intervals relative to the smoother span, and an increase in the variability (standard deviation) of data with a Gaussian distribution.
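As an illustration of the idea of repeated running medians, the following minimal sketch applies a span-3 running median until the sequence stops changing. The span, the end-point rule (end values copied unchanged), the function names and the frequency counts are assumptions made for this example; the smoother actually used in the study may combine other spans and steps.

```python
from statistics import median

def running_median3(values):
    """One pass of a span-3 running median; the two end values are copied."""
    if len(values) < 3:
        return list(values)
    middle = [median(values[i - 1:i + 2]) for i in range(1, len(values) - 1)]
    return [values[0]] + middle + [values[-1]]

def repeated_running_median3(values, max_passes=100):
    """Repeat the span-3 running median until a pass changes nothing."""
    current = list(values)
    for _ in range(max_passes):
        smoothed = running_median3(current)
        if smoothed == current:
            break
        current = smoothed
    return current

# Made-up frequency counts per length class, with isolated spikes.
counts = [2, 9, 3, 4, 12, 5, 6, 1, 7]
smoothed_counts = repeated_running_median3(counts)
```

Because medians ignore isolated extreme values, the spikes are removed without being averaged into their neighbors, which is what makes the smoother resistant.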
Kernel density estimators can be used to reduce distributional noise, and they eliminate entirely the origin-dependence problem of histograms and frequency polygons. Furthermore, the adaptive kernel estimator uses a variable bandwidth that is wider in zones of low data density and narrower where the data values concentrate, avoiding the inconvenience of a fixed width. Despite these improvements in density estimation, kernel estimators are sensitive to the choice of bandwidth (the smoothing parameter), and their greater flexibility comes at the cost of very intensive computation. Applied to some published data sets, the nonlinear resistant smoother and the Gaussian adaptive density estimator produced similar estimates when a modified Bhattacharya procedure was used to identify Gaussian components.
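The variable-bandwidth idea can be sketched as follows. This is a minimal illustration, assuming a Gaussian kernel, a fixed-bandwidth pilot estimate, and local bandwidths proportional to the inverse square root of the pilot density scaled by its geometric mean (one common form of the adaptive estimator). The sensitivity parameter alpha = 0.5, the pilot bandwidth and the data values are assumptions for the example, not values taken from the study.

```python
import math

def gaussian_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def fixed_kde(x, data, h):
    """Fixed-bandwidth Gaussian kernel density estimate at x."""
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (len(data) * h)

def local_bandwidths(data, h, alpha=0.5):
    """Bandwidth per observation: wider where the pilot density is low."""
    pilot = [fixed_kde(xi, data, h) for xi in data]
    geometric_mean = math.exp(sum(math.log(p) for p in pilot) / len(pilot))
    return [h * (p / geometric_mean) ** (-alpha) for p in pilot]

def adaptive_kde(x, data, bandwidths):
    """Adaptive estimate at x: each observation uses its own bandwidth."""
    total = sum(gaussian_kernel((x - xi) / hi) / hi
                for xi, hi in zip(data, bandwidths))
    return total / len(data)

# Made-up values: a tight cluster plus one isolated observation.
lengths = [10.0, 11.0, 11.5, 12.0, 12.5, 13.0, 30.0]
pilot_h = 1.5
bandwidths = local_bandwidths(lengths, pilot_h)
```

In this example the isolated observation receives a bandwidth larger than the pilot value, while observations in the cluster receive smaller ones. Every evaluation point requires a sum over all observations, which is the computational burden mentioned above.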
Averaged shifted histograms (ASH) and the more general weighted averaging of rounded points (WARP), a collection of efficient algorithms based on discretizing the data, accelerate kernel density estimation remarkably. This is particularly important when resampling techniques are used to determine the smoothing parameter (cross-validation), to assess the existence of modes (Silverman's smoothed bootstrap test), or to construct error bars. The problem of bandwidth selection remains, but a powerful collection of selection rules has been developed to minimize the estimation error.
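The discretization behind the averaged shifted histogram can be sketched as below. The data are first counted in narrow bins of width h/m, and the density in each narrow bin is then a triangular-weighted sum of the neighboring counts, which is equivalent to averaging m histograms of bin width h with shifted origins. The function name, the data and the choices h = 10 and m = 5 are assumptions for the illustration.

```python
import math

def ash_density(data, h, m, origin=0.0):
    """Averaged shifted histogram: m shifted histograms of bin width h."""
    n = len(data)
    delta = h / m  # narrow bin width
    index = [math.floor((x - origin) / delta) for x in data]
    # Pad with m - 1 empty narrow bins on each side so no weight is lost.
    low = min(index) - (m - 1)
    high = max(index) + (m - 1)
    counts = [0] * (high - low + 1)
    for k in index:
        counts[k - low] += 1
    # Triangular weights 1 - |i|/m for i = -(m-1), ..., m-1.
    weights = [(i, 1.0 - abs(i) / m) for i in range(1 - m, m)]
    density = []
    for k in range(len(counts)):
        total = 0.0
        for i, w in weights:
            j = k + i
            if 0 <= j < len(counts):
                total += w * counts[j]
        density.append(total / (n * h))
    centers = [origin + (low + k + 0.5) * delta for k in range(len(counts))]
    return centers, density

# Made-up body lengths (mm).
lengths = [152, 155, 158, 161, 163, 170, 171, 175, 190, 192]
centers, density = ash_density(lengths, h=10.0, m=5)
```

The cost after binning depends on the number of narrow bins rather than on the number of observations, which is why this kind of algorithm suits resampling procedures that need many density estimates. With m = 1 the function reduces to an ordinary histogram.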
To assess multimodality, the DIP test for unimodality and Silverman's smoothed bootstrap multimodality test for the Gaussian kernel density estimator were introduced. These procedures were applied to length-frequency data of the estuarine catfish Arius melanopus (n = 1116 females and juveniles) collected in a coastal lagoon in the north-western Gulf of Mexico during previous research. For this data set the DIP statistic rejected the hypothesis of unimodality. The bootstrap test provided solid evidence of multimodality, indicating at least 4 modes, which agrees with the earlier analysis.
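A building block of Silverman's test is the critical bandwidth, the smallest bandwidth at which the Gaussian kernel estimate has at most k modes. The sketch below finds it by bisection, relying on the property that the number of modes of a Gaussian kernel estimate does not increase as the bandwidth grows. It covers only this step, not the bootstrap resampling; the grid, tolerance, function names and data are assumptions for the example.

```python
import math

def kde_on_grid(data, h, grid):
    """Gaussian kernel density estimate evaluated at every grid point."""
    norm = len(data) * h * math.sqrt(2.0 * math.pi)
    return [sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data) / norm
            for x in grid]

def count_modes(density):
    """Number of strict local maxima in a sequence of density values."""
    return sum(1 for i in range(1, len(density) - 1)
               if density[i - 1] < density[i] > density[i + 1])

def critical_bandwidth(data, k, grid, low=0.01, high=None, tol=1e-3):
    """Smallest bandwidth (within tol) giving at most k modes on the grid."""
    if high is None:
        high = max(data) - min(data)  # assumed large enough for <= k modes
    while high - low > tol:
        mid = 0.5 * (low + high)
        if count_modes(kde_on_grid(data, mid, grid)) <= k:
            high = mid
        else:
            low = mid
    return high

# Made-up data with two well-separated groups.
sample = [10.0, 10.5, 11.0, 11.5, 12.0, 20.0, 20.5, 21.0, 21.5, 22.0]
grid = [0.1 * i for i in range(321)]  # 0.0 to 32.0
h_one_mode = critical_bandwidth(sample, 1, grid)
```

In the full test, samples are redrawn from the estimate built with the critical bandwidth, and the proportion of resamples showing more than k modes gives the significance level. Each resample requires a new density estimate, hence the need for a fast estimator.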
A related collection of flexible procedures falls under the general term nonparametric regression, which is employed with complicated functions that are difficult to calculate directly. Kernel regression via discretized and WARPing algorithms and the k-nearest-neighbor estimator were applied to data simulated with the modified von Bertalanffy growth function that accounts for seasonal cessation of growth. The function was recovered notably well when using the bandwidth that minimized the adjusted prediction error under several penalizing functions.
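The two regression estimators named above can be sketched in their direct (non-discretized) form. The kernel estimator is written here in the Nadaraya-Watson form with a Gaussian kernel, and the nearest-neighbor estimator as a plain average of the k closest responses; both choices, the function names and the data are assumptions for the illustration, not the simulated growth data of the study.

```python
import math

def nadaraya_watson(x0, xs, ys, h):
    """Kernel-weighted average of ys around x0 (Gaussian kernel, bandwidth h)."""
    weights = [math.exp(-0.5 * ((x0 - x) / h) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

def knn_regression(x0, xs, ys, k):
    """Average response of the k observations whose x is closest to x0."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x0))[:k]
    return sum(y for _, y in nearest) / k

# Made-up data lying on the line y = 2x + 1.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
kernel_fit = nadaraya_watson(4.5, xs, ys, h=1.0)
knn_fit = knn_regression(4.5, xs, ys, k=2)
```

The bandwidth h and the number of neighbors k play the same smoothing role: larger values average over more observations and give smoother but more biased curves.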
Computerized methods for all these alternative procedures are lacking, so in this study I present a full collection of programs to compute univariate kernel density estimators by means of direct computationally intensive methods, discretized procedures, and the efficient ASH-WARPing algorithm. In addition, some nonparametric regression procedures, including kernel and nearest-neighbor regression, were implemented. The programs were written in the programming language of the Stata statistical software (macros) and in Turbo Pascal for PCs running MS-DOS. The Stata macros take advantage of the built-in graphical capabilities. All computations of the nonparametric methods used in this study were performed with the developed routines, except for the DIP statistic.
The Japanese sea bass Lateolabrax japonicus (“suzuki”) is a very important species in Japanese waters and is highly appreciated as a food fish. It is also regarded as a highly promising species for sea farming in winter. The species has been the focus of several studies of its biology and ecology, but no previous comparative study of several hard parts for age and growth determination was found. For the present study, samples from the commercial catch in Tokyo Bay were obtained at approximately monthly intervals from September 1993 to March 1995. In addition, specimens collected during the surveys of the Laboratory of Fisheries Biology (LFB) were included. A total of 406 individuals were analyzed (109 males, 114 females and 83 of unknown sex), with standard body lengths of 162-664, 155-760, and 123-366 mm, respectively. Adult males and females were obtained from the commercial catch throughout the study period, whereas juveniles were caught only during the winter months on the LFB survey trips. The fish were kept frozen; after thawing, all specimens were measured, weighed and sexed, scales were collected from four body regions for comparative purposes, and both otoliths were removed.
Biometric relationships were explored with the nonparametric procedures described above; in conjunction with ANCOVA, they assisted in the choice of the final parametric model. Nonparametric regression methods (Nadaraya-Watson and k-NN estimators) applied to length-weight data suggested distinct relationships for small and adult fish. Analysis of covariance revealed a significant difference between these two groups. Therefore, a two-phase regression was fitted to the logarithmic length-weight relationship, grouping immature individuals and adults separately.
Length frequencies of males and females, each combined separately with the individuals of unknown sex, were analyzed by means of several nonparametric density estimators: histograms, frequency polygons, averaged shifted histograms and kernel density estimators. Furthermore, the structure of the length distribution was investigated by applying some optimal and oversmoothed bandwidth rules. In addition, the bandwidth minimizing the estimation error was sought by least-squares and biased cross-validation, implemented with the computationally efficient ASH-WARPing algorithm.
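The least-squares cross-validation criterion can be sketched directly for a Gaussian kernel, for which the integral of the squared estimate has a closed form (the convolution of two standard Gaussian kernels is a Gaussian with variance 2). This is a direct O(n²) illustration, not the ASH-WARPing implementation used in the study; the function names, candidate bandwidths and data are assumptions for the example.

```python
import math

def gaussian(u, variance=1.0):
    """Gaussian density with mean 0 and the given variance, evaluated at u."""
    return math.exp(-0.5 * u * u / variance) / math.sqrt(2.0 * math.pi * variance)

def integrated_square(data, h):
    """Exact integral of the squared Gaussian kernel density estimate."""
    n = len(data)
    total = sum(gaussian((xi - xj) / h, variance=2.0)
                for xi in data for xj in data)
    return total / (n * n * h)

def lscv(data, h):
    """Least-squares cross-validation score; smaller is better."""
    n = len(data)
    leave_one_out = sum(gaussian((xi - xj) / h)
                        for i, xi in enumerate(data)
                        for j, xj in enumerate(data) if i != j)
    return integrated_square(data, h) - 2.0 * leave_one_out / (n * (n - 1) * h)

def select_bandwidth(data, candidates):
    """Candidate bandwidth with the lowest cross-validation score."""
    return min(candidates, key=lambda h: lscv(data, h))

# Made-up lengths and an arbitrary grid of candidate bandwidths.
lengths = [10.0, 11.0, 11.5, 12.0, 14.0, 20.0, 21.0, 21.5]
candidates = [0.25 * i for i in range(1, 21)]  # 0.25 to 5.0
best_h = select_bandwidth(lengths, candidates)
```

The score has to be evaluated once per candidate bandwidth, so a discretized estimator reduces the cost considerably when the sample is large.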
The modality of the length distribution was analyzed using nonparametric tests: the DIP test for unimodality and Silverman's smoothed bootstrap multimodality test. This multimodality test required a computationally efficient kernel estimator, because at least one hundred density estimates must be evaluated for each critical bandwidth. The DIP statistic rejected the hypothesis of unimodality, and the bootstrap test provided a lower bound of four modes for both males and females. However, the test suggests that one or two more modes are possible. The biased cross-validation bandwidth estimate corresponded closely with the density estimate suggested by the bootstrap test. In contrast, the estimate obtained with the least-squares cross-validation rule was notably undersmoothed. In the female-unknown sex case, decreasing the bandwidth further split the main mode in two, strongly indicating the existence of two overlapping groups of fish. Age determination confirmed that the overlapping mode was composed of fish 1 and 2 years old.
Age and growth were analyzed by means of scales, whole otoliths and otolith sections. The body zones considered for scale sampling were: Zone 1, body side immediately behind the pectoral fin; Zone 2, nape; Zone 3, belly immediately behind the pelvic fin; and Zone 4, body side immediately above the lateral line and below the first dorsal fin. With a few exceptions, all scale samples were taken from the left side of the fish. A total of 2804 scales from the four body zones were measured to establish the body length-scale length relationship. From each slide of mounted scales, only undamaged, non-regenerated scales were considered. Of the body regions sampled, the zone immediately above the lateral line and below the first dorsal fin (Zone 4) exhibited the lowest standard errors in the body-scale relationship; in addition, scales from this zone were in general easier to read than the others (they displayed clear annuli and fewer check marks). This zone has also been used by several other authors, so it was preferred over the others for age determination and for comparison of results.
Regarding whole-otolith reading, a total of 1245 measurements were made. Numerous false (non-seasonal) rings had formed along the anterior and posterior axes. These made it difficult to determine the beginning of the opaque zone and caused high variability in the measurements. In the oldest specimens the crowding of seasonal rings was notable, and for this reason the radii corresponding to older ages were neither detected nor measured along the dorsal and ventral axes. The axis with the least crowding of rings at older ages was the anterior one; its difficulty for measurement lies in the numerous false rings it contains. Rings beyond the 8th were detected only along the anterior-dorsal axis, but the corresponding radii could not be measured accurately along a straight line. The ventral axis is not suitable for aging fish older than two or three years because of the crowding of annuli at the margin.
The right otolith was embedded in polyester resin and sectioned transversally with a precision cutting machine. A total of 1060 measurements were made along two axes from the center: along the sulcal groove and toward the ventral margin. In general, otolith sections were the easiest structure to read; however, they were the hard part requiring the most elaborate preparation. The sections clearly showed why annuli beyond the 5th or 6th cannot be recognized along the dorsal axis: the otolith changes its direction of growth.
The percentage agreement between structures was as follows (scales from body zone 4): scales versus whole otoliths, 67%; whole otoliths versus otolith sections, 59%; scales versus otolith sections, 66%. The agreement between scales and otolith sections was poor at older ages but, surprisingly, was overall slightly higher than the agreement between whole and sectioned otoliths. One reason for the difference between the otolith methods was the presence of many false annuli, which led to more variable measurements. The sections were easy to read, but it was difficult to identify opaque bands at the margins. In contrast with sections, both scales and whole otoliths gave younger ages for the older fish. Compared with other localities, mean lengths at age in this study were larger at smaller sizes. However, owing to the limited number of older specimens, older fish showed in general smaller body lengths. Most of the fish examined were immature specimens one and two years old. The oldest fish analyzed was a 14-year-old female of 850 mm total length. For the three aging methods, von Bertalanffy growth equations were fitted to pairs of age-length values and to pairs of age and mean length at age by means of a nonlinear regression algorithm. In general terms there was good agreement between the fitting methods.
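For reference, the standard (non-seasonal) von Bertalanffy growth equation expresses length at age t as L_inf * (1 - exp(-K * (t - t0))). The sketch below evaluates it and fits it by minimizing the sum of squared residuals over a coarse grid of parameter values. The grid search is a simple stand-in for the nonlinear regression algorithm used in the study, and the parameter values and ages are hypothetical, chosen only to generate error-free example data.

```python
import math

def von_bertalanffy(age, l_inf, k, t0):
    """Length at a given age under the von Bertalanffy growth equation."""
    return l_inf * (1.0 - math.exp(-k * (age - t0)))

def sum_of_squares(params, ages, lengths):
    """Sum of squared differences between observed and predicted lengths."""
    l_inf, k, t0 = params
    return sum((length - von_bertalanffy(age, l_inf, k, t0)) ** 2
               for age, length in zip(ages, lengths))

def grid_search(ages, lengths, l_inf_values, k_values, t0_values):
    """Parameter triple on the grid with the smallest sum of squares."""
    candidates = [(l_inf, k, t0)
                  for l_inf in l_inf_values
                  for k in k_values
                  for t0 in t0_values]
    return min(candidates, key=lambda p: sum_of_squares(p, ages, lengths))

# Hypothetical parameters used only to simulate lengths at ages 1 to 10.
true_params = (800.0, 0.25, -0.5)
ages = list(range(1, 11))
simulated = [von_bertalanffy(age, *true_params) for age in ages]
best = grid_search(ages, simulated,
                   [700.0, 800.0, 900.0], [0.15, 0.25, 0.35], [-1.0, -0.5, 0.0])
```

The same objective function applies whether the input is individual age-length pairs or mean lengths at age; only the data passed in change.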
To study the seasonal variation of several indicators of body condition, a multivariate analysis of covariance was carried out. The design considered three dependent variables: total weight, gonad weight and liver weight. The curvilinear relationships between the weight and length variables were linearized by means of a logarithmic transformation. An attempt was made to use two factors (sex and sampling date) and one covariate (standard body length), but only the designs with a single factor (sampling date) led to valid results. A preliminary design with interaction was then calculated and produced non-significant interactions. Therefore, the design with the factor and the covariate was calculated. The results revealed the winter reproductive season, with high mean gonad weights. One exception was the high mean for April, when a small set of fish with large, heavy gonads was recorded. Mean liver weight showed the opposite trend, and stomach weights were highest during spring.