Equivalence and Standard Scores of the Hurlbert Index of Sexual Assertiveness Across Spanish Men and Women

Abstract: The purpose of the present study was to analyze the measurement invariance and differential item functioning of the Spanish version of the Hurlbert Index of Sexual Assertiveness across gender. The sample was composed of 1,600 women and 1,598 men from Spain, with ages ranging from 18 to 84 years. The Hurlbert Index of Sexual Assertiveness showed only weak invariance across men and women. The differential item functioning analysis showed that only item 2 (“I feel that I am shy when it comes to sex”) was flagged for moderate uniform differential item functioning. More specifically, women tended to respond “Always” to this item more frequently than men did. Results strongly suggested eliminating item 2, resulting in a final version with 18 items clustered into two dimensions, with good reliability values for both men and women. Standard scores for both Initiation and No Shyness/Refusal reflected traditional sexual scripts for men and women. Keywords: Sexual assertiveness; Hurlbert Index of Sexual Assertiveness; measurement invariance; differential item functioning; standard scores.


Introduction
Sexual assertiveness has been defined in a variety of ways. Painter (1997) stated that sexual assertiveness is the ability to develop assertive behaviors in a sexual context. Dunn, Lloyd, and Phelps (1979) noted that it involves using "behavioral skills to obtain sexual satisfaction for yourself and your partner" (p. 294). Morokoff et al. (1997) provided a clearer picture of sexual assertiveness by stating that it embraces the ability to initiate desired sexual contacts, the ability to refuse unwanted sexual contacts, and the ability to prevent pregnancy or STIs with a regular partner. In line with this definition, several studies have explored the relevance of sexual assertiveness for human sexual life (for a review, see Santos-Iglesias & Sierra, 2010a) and concluded that it helps develop healthy sexual behaviors (e.g., condom use) and obtain greater sexual satisfaction. Finally, sexual assertiveness training programs help promote positive sexual outcomes and behaviors (Kelly, St. Lawrence, Hood, & Brasfield, 1989; Murphy, Coleman, Hoon, & Scott, 1980; St. Lawrence et al., 1995).
According to sexual script theory (Simon & Gagnon, 1984, 1986, 2003), men are typically initiators of sexual encounters, while women are supposed to be restrictors of such contacts. Thus, men should score high on initiation sexual assertiveness (i.e., the ability to initiate desired sexual contacts), while women should score high on refusal sexual assertiveness (i.e., the ability to refuse undesired sexual contacts). This traditional sexual script has prompted research on whether men or women score higher on sexual assertiveness. In general, studies have found that men scored higher than women on sexual assertiveness (Haavio-Mannila & Kontula, 1997; Pierce & Hurlbert, 1999; Snell, Fisher, & Miller, 1991), although results have been mixed (Stulhofer, Graham, Bozicevic, Kufrin, & Ajdukovic, 2007). For example, Pierce and Hurlbert (1999) interviewed 54 non-clinical individuals and 46 clinical individuals attending sex therapy and showed that men in both clinical and non-clinical samples scored higher on sexual assertiveness than women. On the other hand, Stulhofer et al. (2007) interviewed a nationally representative sample of young men and women and found that women scored higher than men on sexual assertiveness. These results can be explained by the fact that the studies by Pierce and Hurlbert and by Snell et al. were based on sexual assertiveness scores mostly composed of initiation items, while Stulhofer et al. used refusal assertiveness items (A. Stulhofer, personal communication, March 22, 2011). Moreover, a study by Sierra, Santos-Iglesias, and Vallejo-Medina (2012) showed that, as age increased, initiation sexual assertiveness was higher in men compared to women. These authors also found that refusal sexual assertiveness was higher in women than in men regardless of age. These results suggest that sexual assertiveness might follow traditional sexual scripts. They also noted that men and women have usually been compared on the basis of their sexual assertiveness. However, only the study by Sierra, Santos-Iglesias, et al. tested for measurement invariance and differential item functioning of one of these sexual assertiveness measures. They found that the Sexual Assertiveness Scale had a strictly equivalent dimensionality across sexes and that only one item was flagged for differential item functioning, so they concluded that there is no significant bias in the scale when comparing sexual assertiveness across sexes.
anales de psicología, 2014, vol. 30, nº 1 (enero)
Measurement invariance means that the probability of an observed score does not depend on the person's group membership (Meredith, 1993), that is: "respondents from different groups, but with the same true score, will have the same observed score" (Wu, Li, & Zumbo, 2007, p. 2). This concept implies that measuring a construct with the same instrument will reflect differences between groups in the performance or attribute of interest, and not differences due to confounding variables. Differential item functioning (DIF) concerns the conditional probability of answering an item in two or more groups after matching on the underlying ability (Hidalgo & Gómez, 2006; Zumbo, 1999). In the context of sexual assertiveness, for example, measurements should be invariant and free of DIF so that comparisons between men and women reflect real differences in sexual assertiveness and not differences caused by sexist items or item comprehension. Both procedures are strongly related (Dimitrov, 2010; Holland & Wainer, 1993) and are supposed to be tested together as evidence of validity, especially when test scores are used to compare groups.
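The two definitions above can be written as conditional-independence statements. The following is only a sketch in notation assumed here, not taken from the article: X is the observed score, Y_j the response to item j, \xi the latent attribute (or matching variable), and G the group.

```latex
% Measurement invariance: given the latent attribute, the observed score
% distribution is the same in every group g.
\[
  f(X \mid \xi, G = g) = f(X \mid \xi) \quad \text{for all } g .
\]
% Absence of DIF for item j: given the attribute, the probability of each
% response category k does not depend on group membership.
\[
  P(Y_j = k \mid \xi, G = g) = P(Y_j = k \mid \xi) \quad \text{for all } g, k .
\]
```

An item shows DIF when the second equality fails for at least one response category.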
The Hurlbert Index of Sexual Assertiveness (HISA; Hurlbert, 1991) is one of the instruments used most frequently to assess sexual assertiveness (Santos-Iglesias & Sierra, 2010a). In its original version, it was composed of 25 items providing a one-dimensional measure of sexual assertiveness in couples. The Spanish adaptation was shortened to a 19-item version clustered into two dimensions: (a) Initiation, which reflects the ability to begin sexual contacts and to express sexual desires and fantasies; and (b) No Shyness/Refusal, which refers to difficulty starting and maintaining sexual conversations and the inability to reject undesired sexual contacts (Santos-Iglesias & Sierra, 2010b). Although the HISA has shown adequate psychometric properties (Santos-Iglesias & Sierra, 2010b; Sierra, Santos, Gutiérrez-Quintanilla, Gómez, & Maeso, 2008) and has been used to compare men and women (see Pierce & Hurlbert, 1999), no studies have tested whether its psychometric properties are the same for men and women. Thus, the main aim of the present study was to assess the measurement invariance and DIF of the Hurlbert Index of Sexual Assertiveness across gender using a Spanish sample. Due to the lack of normative data and its potential usefulness for clinical and epidemiological assessments, standard scores were developed for both the Initiation and No Shyness/Refusal subscales for men and women across three different age groups (18-34, 35-49, and 50 years old or older).

Method

Participants
Participants were recruited from the general population in Spain. The final sample was composed of 1,598 men and 1,600 women. Since the HISA assesses sexual assertiveness within partners, participants were required to be involved in a romantic relationship that included sexual activity at the time of the study. The mean age of men was 39.47 years (SD = 13.38, range 18-81), while that of women was 36.98 years (SD = 13.41, range 18-84). Educational level, religion, and frequency of religious practice are reported in Table 1.

Measures
A background questionnaire was administered to obtain information about sex, age, whether participants were involved in a romantic relationship, whether they had sexual activity with their partners, educational level, religion, and frequency of religious practice.

Procedure
Participants were recruited from the Spanish general population. A quota convenience sampling method was used to obtain the same number of men and women, distributed across different groups according to age (18-34 years old, 35-49 years old, and 50 years old or older), size of the town or city of residence (a population smaller or greater than 50,000), and geographical area (north and south of Spain). Participants were required to have been involved in a stable heterosexual relationship with sexual activity (because sexual assertiveness implies negotiation of sexual behaviors) for at least 6 months at the time of the study. Testing was conducted individually by well-trained researchers in different settings (public libraries, social centers, and public places). In university classrooms, participants were tested collectively. The purpose of the study was briefly explained to all participants. Verbal informed consent was obtained, and anonymity and confidentiality were guaranteed, as well as the exclusive use of the tests for research purposes.

Data analysis
Measurement invariance was tested using LISREL 8.51 (Jöreskog & Sörbom, 2001) following the procedure described by Wu et al. (2007) for multi-group confirmatory factor analysis (MG-CFA). Four models were assessed: (a) configural invariance constrained the number of factors and the pattern of free and fixed loadings to be the same across both groups; (b) weak invariance tested equality of factor loadings across groups; (c) strong invariance tested equality of intercepts across groups; and (d) strict invariance assumed that residual variances for all items were equal across groups. These four models were estimated using maximum likelihood. In order to avoid problems with sample size, three main indices were used to assess fit: the Root Mean Square Error of Approximation (RMSEA), the Non-Normed Fit Index (NNFI), and the Comparative Fit Index (CFI). In this context, NNFI and CFI values above .90 and RMSEA values below .05 were used as indicators of good fit (Cheung & Rensvold, 2002; Wu et al., 2007). Additionally, because the MG-CFA models are nested, changes in the fit indices were examined (Cheung & Rensvold, 2002; Wu et al., 2007). Cheung and Rensvold (2002) recommended using ∆CFI and proposed ∆CFI ≥ -.01 (i.e., a drop in CFI no larger than .01) as a good indicator of measurement invariance.
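The four nested models can be summarized in standard MG-CFA notation. This is a sketch under assumed notation (LISREL-style x-side symbols; the superscript g indexes men and women), not a transcription of the authors' model specification.

```latex
% Measurement model for group g:
\[
  x^{(g)} = \tau_x^{(g)} + \Lambda_x^{(g)} \xi^{(g)} + \delta^{(g)},
  \qquad \Theta_\delta^{(g)} = \operatorname{Cov}\bigl(\delta^{(g)}\bigr).
\]
% Nested constraints, each model adding to the previous one:
\begin{align*}
  \text{(a) configural:} &\quad \text{same pattern of free and fixed elements in } \Lambda_x^{(1)}, \Lambda_x^{(2)} \\
  \text{(b) weak:}       &\quad \Lambda_x^{(1)} = \Lambda_x^{(2)} \\
  \text{(c) strong:}     &\quad \text{(b) and } \tau_x^{(1)} = \tau_x^{(2)} \\
  \text{(d) strict:}     &\quad \text{(c) and } \Theta_\delta^{(1)} = \Theta_\delta^{(2)}
\end{align*}
```

Each model is compared with the preceding, less constrained one through the change in CFI.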
Step 1 tested the contribution of each subscale score (Initiation and No Shyness/Refusal). Step 2 tested whether the item score significantly contributed to differences between men and women (dependent variable), and Step 3 tested the interaction between subscale score and item score. Significance of Step 2 - Step 1 (i.e., of the Step 2 block itself) indicated uniform DIF, while significance of Step 3 - Step 2 (i.e., of the Step 3 block itself) was considered evidence of non-uniform DIF. Effect size was assessed through the increase in Nagelkerke's R², so that values up to .035 indicated negligible DIF, values between .035 and .070 indicated moderate DIF, and values above .070 indicated large DIF (Jodoin & Gierl, 2001). A stepwise purification procedure was performed for all items showing DIF. Finally, to identify the response category in which DIF occurred, a discriminant logistic analysis using a cumulative probability model was performed on each item showing DIF (Mellenberg, 1995).
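The effect-size rule described above can be sketched in a few lines of Python. The function names are hypothetical, the log-likelihoods would come from whatever logistic regression software fits the three steps, and the handling of values falling exactly on .035 or .070 is an assumption, since the text does not say which category they belong to.

```python
import math


def nagelkerke_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Nagelkerke's R^2 from the log-likelihoods of the null and fitted models."""
    # Cox-Snell R^2, rescaled by its maximum attainable value.
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    upper_bound = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / upper_bound


def classify_dif(delta_r2: float) -> str:
    """Label an increase in Nagelkerke's R^2 between two consecutive steps.

    Cut-offs follow the values quoted in the text (.035 and .070);
    the treatment of exact boundary values is assumed here.
    """
    if delta_r2 < 0.035:
        return "negligible"
    if delta_r2 <= 0.070:
        return "moderate"
    return "large"


# Increase reported in the text for item 2 (Step 2 - Step 1).
print(classify_dif(0.059))  # moderate
```

In use, the increase for uniform DIF would be the difference between the Nagelkerke values of Step 2 and Step 1, and the increase for non-uniform DIF the difference between Step 3 and Step 2.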

Measurement invariance
Testing of measurement invariance started with configural invariance. Results showed that the model was the same for men and women (see Table 2). Although the χ² value was extremely high due to the large sample size, the CFI and RMSEA showed adequate fit. Step 2 involved testing whether weak invariance, or factor loading equivalence, was supported. The NNFI and CFI showed good fit, the RMSEA was close to a good fit, and the change in CFI was -.002, indicating good fit for the nested comparison between model 1 and model 2. Step 3 tested strong invariance, or equivalence of intercepts across groups. Results showed an increase in the RMSEA and a decrease in the GFI, NNFI, and CFI. Furthermore, the change in CFI reached .023, which meant that this nested model did not fit the data and therefore that strong invariance was not supported. At this point, the modification indices in the Tau-x matrix were assessed and revealed that items 2, 9, and 13 had large modification values (110.19, 62.58, and 57.39, respectively) and large expected change values (.149, -.152, and -.149, respectively). This suggested testing strong invariance again without restrictions for these three items. Although results showed only a slight, non-significant decrease in the CFI (∆CFI = -.01), values of the NNFI, CFI, and RMSEA did not support partial strong invariance either. Since neither strong invariance nor partial strong invariance was supported, we did not test strict invariance across gender (Byrne & van de Vijver, 2010; Wu et al., 2007).
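The nested-model decisions reported above can be checked against the ∆CFI cut-off with a short Python sketch. The function name and the dictionary labels are hypothetical. It is also assumed that ∆CFI is the CFI of the more constrained model minus that of the less constrained one, so the change of .023 described as a decrease is entered as -.023.

```python
def delta_cfi_supports_invariance(delta_cfi: float, cutoff: float = -0.01) -> bool:
    """Return True when CFI drops by no more than |cutoff| after adding constraints."""
    return delta_cfi >= cutoff


# Changes in CFI taken from the text; signs follow the assumption stated above.
reported_changes = {
    "weak vs. configural": -0.002,
    "strong vs. weak": -0.023,
    "partial strong (items 2, 9, 13 freed)": -0.01,
}

for comparison, delta in reported_changes.items():
    verdict = "within cut-off" if delta_cfi_supports_invariance(delta) else "exceeds cut-off"
    print(f"{comparison}: delta CFI = {delta:+.3f} -> {verdict}")
```

Note that the partial strong invariance model sits at the cut-off on ∆CFI alone; it was rejected in the text because the NNFI, CFI, and RMSEA values themselves did not indicate adequate fit.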

Differential item functioning
As shown in Table 3, the only item flagged for moderate uniform DIF across gender was item 2 (R² = .059; "I feel that I am shy when it comes to sex"). The purification process showed that, after deleting item 2 from the matching score, uniform DIF was still moderate (χ² = 144.55, p < .001, R² = .059). Results of the discriminant logistic analysis performed on the response scale categories revealed moderate uniform DIF in response category 4 (always) for item 2 (χ² = 158.56, p < .001, R² = .065), indicating that women chose this anchor more frequently than men (OR = 0.37).

Standard scores and internal consistency
Standard scores for Initiation and No Shyness/Refusal were created from z score transformations due to the violation of normality (see Table 4 and Table 5, respectively). Cronbach's alpha values for both subscales are shown in parentheses in Table 4 and Table 5. It must be noted that, in line with the DIF results, item 2 was eliminated from the No Shyness/Refusal subscale before calculating standard scores. Results showed that men tended to report higher scores than women on initiation assertiveness across all ages. Regarding the No Shyness/Refusal subscale, young women scored slightly higher than young men, but older women scored lower than older men (see Figure 1 and Figure 2).
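As an illustration of how a z score transformation yields a standard score, the sketch below assumes a plain linear transformation within one normative group (for example, one sex by age cell). The article does not detail the exact procedure, and the function names and raw scores are hypothetical, not data from the study.

```python
from statistics import mean, stdev


def to_z(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Convert a raw subscale score to a z score within a normative group."""
    return (raw - norm_mean) / norm_sd


def z_table(raw_scores: list[float]) -> dict[float, float]:
    """Map each distinct raw score in a normative sample to its z score."""
    m = mean(raw_scores)
    s = stdev(raw_scores)  # sample standard deviation (n - 1 denominator)
    return {x: to_z(x, m, s) for x in sorted(set(raw_scores))}


# Hypothetical raw Initiation scores for one normative cell.
sample = [10.0, 20.0, 30.0]
print(z_table(sample))
```

A clinician would look up an individual's raw score in the table for the matching sex and age group.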

Discussion
When assessment instruments are used to compare groups (e.g., cultures or genders), it is essential that they operate in the same way for each group (Dimitrov, 2010). The main purpose of the present study was to analyze the measurement invariance and differential item functioning of the Spanish version of the Hurlbert Index of Sexual Assertiveness (HISA; Santos-Iglesias & Sierra, 2010b), because sexual assertiveness is a construct that has typically been compared across men and women. The Spanish version of the HISA showed only weak measurement invariance. Only item 2 ("I feel that I am shy when it comes to sex") was flagged for moderate uniform DIF. Thus, we strongly recommend eliminating this item from the scale.
Regarding measurement invariance, results show that the model proposed by Santos-Iglesias and Sierra (2010b) is the same for men and women, as shown by the configural invariance test. Further, not only is the structure the same, but factor loadings are also equivalent across gender. On the other hand, results failed to support both strong and strict invariance. Thus, since strong and, especially, strict invariance are not satisfied (Wu et al., 2007), we cannot assume the Spanish version of the HISA to be invariant across sexes. This result is particularly relevant when the aim is to compare the sexual assertiveness scores of men and women using this scale. In such a case, we strongly encourage using the Spanish version of the Sexual Assertiveness Scale (Sierra et al., 2011), which has been demonstrated to be invariant between men and women (Sierra et al., 2012).
The differential item functioning analysis revealed that item 2 showed uniform DIF, which means that men and women have different probabilities of endorsing a response even when they have the same level of the attribute. More specifically, women have a greater probability of responding "Always" to item 2 ("I feel that I am shy when it comes to sex") compared to men. These results are related to traditional sexual scripts and gender-role stereotypes, in which women are supposed to show traditionally feminine attributes like being sympathetic or shy (Bem, 1974; Holt & Ellis, 1998) and are encouraged not to talk overtly about sex (Quina, Harlow, Morokoff, Burkholder, & Deiter, 2000). Based on these results, we propose that item 2 should be eliminated from the Spanish version of the HISA, resulting in an 18-item questionnaire clustered into two different factors: Initiation (8 items) and No Shyness/Refusal (10 items).
Finally, standard scores are provided. Mean scores reveal that assertiveness still follows traditional sexual scripts and gender-role stereotypes, especially among older participants. Accordingly, men assertively initiate sexual contacts more frequently than women (Haavio-Mannila & Kontula, 1997; López, Carcedo, Fernández-Rouco, Blázquez, & Kilani, 2011; Pierce & Hurlbert, 1999; Snell et al., 1991) because they are supposed to initiate sexual contacts while women are supposed to act as restrictors of such contacts (Simon & Gagnon, 1984, 1986, 2003). In addition, young women scored slightly higher than young men on the No Shyness/Refusal subscale, which indicates that young women are less shy and refuse sexual contacts more often than young men. Regarding older men and women, results reveal that older women are shyer and less able to refuse undesired sexual contacts than older men. These results, although contrary to traditional sexual scripts, are consistent with some gender stereotypes, such as shyness in women (Bem, 1974; Holt & Ellis, 1998), and actually show that sexually assertive skills were not traditionally taught to women (Muehlenhard & McCoy, 1991). This is particularly true in the case of Spanish women, who in the past were taught to be "good wives" and comply with their partners' sexual desires (Vázquez García & Moreno Mengíbar, 1997).
Some implications of these results must be noted. First, the factor structure found by Santos-Iglesias and Sierra (2010b) has been replicated in a sample of Spanish men and women, which is an indicator of the construct validity of the scale. Second, although the factor structure is replicated, the results from this study demonstrate that the scale does not reach strong measurement invariance between men and women. Thus, it is not recommended to use this scale when the purpose of the study is to compare male and female scores. In such a case, it is possible to use the Spanish version of the Sexual Assertiveness Scale (Sierra et al., 2011), whose equivalence across gender has been proven. Third, the standard scores provided here are useful tools for clinicians and applied psychologists who want to assess individuals' sexual assertiveness. Finally, some limitations must be noted. First, results are based on a non-representative sample with a large proportion of participants with a high educational level, which implies that these results cannot be generalized to the entire Spanish population. Second, the results only apply to the Spanish version of the HISA, so no inferences can be made about the original English version.
The Spanish version of the HISA (Santos-Iglesias & Sierra, 2010b) was used. It includes 19 items clustered into two factors: Initiation and No Shyness/Refusal. Participants responded using a 5-point Likert scale from 0 (never) to 4 (always). Higher scores indicated greater initiation assertiveness (Initiation subscale), and lack of shyness and greater refusal assertiveness (No Shyness/Refusal subscale). Santos-Iglesias and Sierra reported an internal consistency (McDonald's omega) of .83 for each factor and .87 for the global scale. It showed positive correlations above .10 with the Spanish version of the Sexual Assertiveness Scale (Sierra, Vallejo-Medina, & Santos-Iglesias, 2011) and above .18 with the Spanish abbreviated version of the Dyadic Adjustment Scale (Santos-Iglesias, Vallejo-Medina, & Sierra, 2009).

Figure 1. Means of Initiation assertiveness for men and women.

Figure 2. Means of No Shyness/Refusal assertiveness for men and women.

Table 1. Educational level, religion, and religious practice of both men and women.

Table 2. Goodness-of-fit indices for measurement invariance models.

Table 3. Differential item functioning of the Initiation and No Shyness/Refusal subscales.

Table 4. Standard scores of the Initiation subscale.

Table 5. Standard scores of the No Shyness/Refusal subscale.