

 
Chapter 5
Testing Perceptions and Financial Viability

5.1. Introduction

This chapter describes a number of expert assessments and Cost Benefit Analyses (CBAs) to answer the subquestion: what support can be found for hypotheses about the viability of Multimedia Retrieval Systems (MRSs) for Marketing & Sales (M&S)?

This chapter has two parts in which the five hypotheses, described in the previous chapter, are tested.

The first part contains expert assessments of the value added of MM and the relevance of success/risk factors, the viability of seven clusters of MM teleservices, and the viability of a Multimedia Business Catalogue (MBC) and a promotional CD-i. Thereafter, the results of a survey of potential investors in MM teleshop services are given. Next, marketing research is reviewed to assess the economic viability of MRSs for M&S.

The second part contains a retrospective CBA and ROI computation of the IECT photo archive, making a case for the effectiveness and economic viability of an MCA, and a prospective CBA and ROI computation of an MBC for tele-ordering by Top 1000 accounts, giving an indication of the effectiveness and economic viability of an MBC for tele-ordering.

Finally, the findings are summarised.

5.2. Test measurement, test reliability and test validity

Since the first part of this chapter is about testing statistical hypotheses using questionnaires for surveys and expert assessments, it is important to look deeper into the issue of test measurement, test reliability and test validity. A test is here used as an instrument for obtaining a sample of opinions about MM, retrieval, success/risk factors, system effectiveness and system viability.

Test measurements take place at the:

    • interval level using a five point Likert-type scale (Mitchell & Jolley, 1992);

    • nominal level using a dichotomous 'yes/no' scale (in a few cases only).

On a Likert-type scale, subjects usually respond to a statement by checking either "strongly agree" (scored a "5"), "agree" (scored a "4"), "undecided" (scored a "3"), "disagree" (scored a "2") or "strongly disagree" (scored a "1"). An example is given below.

 

Scarcity of multidisciplinary expertise is a typical risk factor for multimedia projects.

Strongly disagree 1----------2----------3----------4----------5 Strongly agree

 

Depending on the question, variations are used, for example Likert-type scales ranging from "very positive" to "very negative". Traditionally, psychologists have assumed that Likert-type scales yield interval data, meaning that there is an equal psychological interval between each consecutive number. The advantage of Likert-type scales is that they yield more information than nominal-dichotomous items. More importantly, Likert-type items can be analysed by more powerful statistical tests, such as the t-test and ANOVA, than nominal-dichotomous items (Mitchell & Jolley, 1992). Another possibility of Likert-type scales is that answers to items measuring the same variable can be summated. An important advantage of this is that tests with summated scores are more reliable than one-question tests.

For these reasons, nominal-dichotomous items are used only sparingly in the expert assessment tests. Where they are used, the reason is that answers on the item are not intended to test a statistical hypothesis, or that the use of interval scales looks too artificial.

The significance levels used are α = .05 (p values of less than .05 are significant) and α = .01 (p values of less than .01 are highly significant). Non-significant (NS) p values are not presented.

 

The known methodological weaknesses of psychological tests are the threats to validity, "does the test measure what it purports to measure?", and reliability, "are the test results consistent?". Experts were specifically selected in several cases, because it is believed that they hold reasonable, consistent, opinions, and that they are able to make better informed judgements than a layman.

 

Reliability

Reliability, i.e., the repeatability of any measurement of a variable, is extremely important. Reliability is the consistency of test results, including the tendency of a test or measurement to produce the same results when it twice measures some entity or attribute that is believed not to have changed in the interval between the measurements (Kidder, 1981).

Test-retest reliability is a convenient 'interpretation' of test reliability (Kidder, 1981). Yet, there are two major problems with test-retest reliability estimates (Allen & Yen, 1979):

    Carry-over or learning effects: the first testing may influence the second testing.

    Time interval effects: long time intervals make effects due to changes in information or moods more likely. New market information or decisions on project budgets may influence the perceptions of the subjects between tests. As a consequence test reliability tends to be underestimated.

To estimate the test reliability another approach is also used: split-half test reliability. The test groups are split (odd/even) into two halves, and the correlation between the two halves is determined. The outcomes tend to be somewhat lower than test-retest reliability estimates of the whole test due to the smaller subgroups. To overcome this problem the Spearman-Brown Formula is used if the halves of the test are parallel (Allen & Yen, 1979).
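The Spearman-Brown step-up can be sketched as below. This is a minimal illustration, not the computation used for this research: the function name `spearman_brown` and the `factor` parameter are assumptions, and the inputs are the split-half correlations reported in tables 13 and 17.

```python
def spearman_brown(r_half: float, factor: float = 2.0) -> float:
    """Step a split-half correlation up to the reliability of the whole test.

    factor is the lengthening factor; 2 means the whole test is twice as
    long as each half (the usual split-half case).
    """
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


# Split-half correlations reported in tables 13 and 17 of this chapter.
for r_half in (0.57, 0.80, 0.83):
    print(f"split-half r = {r_half:.2f} -> reliability = {spearman_brown(r_half):.2f}")
```

With the default factor of 2 the formula reduces to 2r / (1 + r), which gives 0.73, 0.89 and 0.91 for the three correlations, in line with the reliability estimates reported in those tables.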

If item scores need to be summated it is useful to determine Cronbach's coefficient α (Allen & Yen, 1979) for item homogeneity.
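As an illustration of coefficient α for summated Likert items, the sketch below uses the standard formula α = k/(k-1) · (1 - Σ item variances / variance of the summated scores). The function name `cronbach_alpha` and the response data are hypothetical and do not come from the questionnaires of this research.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(scores[0])  # number of items
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])  # summated scores
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: 5 respondents answering 3 Likert-type items (1..5).
responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 4],
    [2, 3, 3],
    [4, 4, 5],
]
print(f"alpha = {cronbach_alpha(responses):.2f}")
```

The closer α is to 1, the more homogeneous the items are, and the better the case for summating them.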

These three ways of estimating test reliability are used for the tests described in following sections.

Validity

Test validity is related to the question: do you measure what you want to measure? In this case: can we measure viability and related factors? It can be argued that sometimes an attempt is made to measure the unmeasurable, to predict future viability.

When performing expert assessment tests one should be well aware of some validity threats.

    Test leader effects (Kidder, 1981): the biases of a test leader influence the experts, i.e., lead to subject biases (Mitchell & Jolley, 1992), thus influencing the outcome in the expected direction. For example, a test leader who shows little enthusiasm will probably get less positive responses about an MRS than a more enthusiastic test leader.

    Demonstration system effects: in the case where a demonstration is given, the quality of the demonstration influences the responses. For example, a very good demonstration of an MRS will lead to a different perception of its viability than a bad demonstration.

    Order effects (Kidder, 1981): the order of the stimuli presented to the expert panel may lead to unwanted interactions.

An attempt was made to reduce test leader effects by standardising the testing procedures and demonstration system effects by giving only 'realistic' demonstrations. Order effects are difficult to avoid completely in business settings.

Therefore, it is important to assess the validity of the tests, apart from more qualitative approaches with regard to content validity (e.g., face validity and logical validity) (Allen & Yen, 1979). Predictive validity (Allen & Yen, 1979) is of interest, as are external validity (Mitchell & Jolley, 1992), and concurrent validity (Allen & Yen, 1979). Let us have a closer look at the possibility of obtaining such viability estimates.

The predictive validity of the tests used can only be measured by a longitudinal study: we must be able to wait 5 or 10 years and then look back and evaluate whether the MRS for M&S were as viable as expected. Such a longitudinal study is well beyond the scope of this research, and moreover, after 5 or 10 years the results of a longitudinal research design, however interesting from a methodological point of view, will no longer hold much interest for decision makers as new types of system will have evolved, making the results obsolete.

Another interesting issue is external validity, i.e., to what degree can test results be generalised to other settings, subjects and times? In my research generalisation to other times is not very relevant, because it is assumed that the viability of MRSs for M&S varies with time; they have a certain life cycle. Generalisation to other settings and subjects is, however, of interest.

The concurrent validity of a test is demonstrated by a test and criterion scores when both measurements are obtained at (about) the same time. The concurrent validity of a test to measure the viability of an MRS over time can be obtained, for example, by making a comparison with market forecasts from independent market researchers (the criterion scores).

An attempt was made to estimate concurrent and external validity for several of the tests described in the following sections.

 

5.3. Perceptions of the value added of MM and success/risk factors.

5.3.1. Introduction

A survey of MM projects was performed to test hypotheses about the value added of MM, the typical risk factors for MM projects and the risk factors for the introduction of MM products and services. With regard to the risk factors, only a small subset of factors was selected, namely those believed to be MM specific.

 

With regard to the value added of MM the hypotheses tested are summarised in the table below. These hypotheses are related to the H1 hypothesis discussed in chapter 4. Four alternative hypotheses were formulated which apply to M&S situations in which an effective information and knowledge transfer is needed.

 
No - Alternative hypothesis
1-1 The user friendliness of an IS improves with MM.
1-2 The presentation of information improves by using audio and video.
1-3 An MM message is understood better than a textual message.
1-4 Service to customers improves by using MSs.
 

    Table 9. Overview of alternative hypotheses with regard to
    the value added of MM.

     

In the previous chapter a general hypothesis was formulated that the identified project management and system success/risk factors are critical for the viability of MRSs for M&S (H4).

With regard to the risk factors for MM projects, four testable hypotheses were formulated about factors which seem to be typical for MM projects: complexity (4-5), scarcity of multidisciplinary expertise (4-6), high production costs (4-7) and too little standardisation (4-8).

 
 
No - Alternative hypothesis
4-5 Complexity is a typical risk factor for MM projects.
4-6 Scarcity of multidisciplinary expertise is a typical risk factor for MM projects.
4-7 High production costs for audio-visual information is a typical risk factor for MM projects.
4-8 Too little standardisation of MM products is a typical risk factor for MM projects.
 

    Table 10. Overview of alternative hypotheses with regard to
    MM project management risk factors.

     

The hypotheses about risk factors for the introduction of MM products and services are also related to the general success/risk factors hypothesis H4 discussed in the previous chapter. The hypotheses about costs (4-9, 4-10 and 4-11) are related to the category of system success/risk factors 'system and usage costs'. Unstable standards (4-13) and becoming outdated quickly (4-14) are related to the category 'technical reliability'. Too little dissemination of use (4-12) is added because it seems to be an inhibiting factor in a market which is still in its infancy. For example, if only a few people use certain MSs then the market for certain MR products and services is very small. For a supplier of such products it is very difficult to reach an acceptable ROI in such markets.

 
 
No - Alternative hypothesis
4-9 High costs of hardware and software are a risk factor for the introduction of MM products and services.
4-10 High costs of using information services are a risk factor for the introduction of MM products and services.
4-11 High costs of telecommunication are a risk factor for the introduction of MM products and services.
4-12 Too little dissemination of use is a risk factor for the introduction of MM products and services.
4-13 Unstable standards are a risk factor for the introduction of MM products and services.
4-14 Products and services becoming outdated quickly are a risk factor for the introduction of MM products and services.
 

    Table 11. Overview of alternative hypotheses with regard to
    the introduction of MM products and services

     

5.3.2. Method

Perceptions or opinions about the value added and risk factors were measured.

A survey (n=20) was performed from August to December 1993 with respondents from MM projects, mostly project leaders. About half of the respondents had participated in commercial projects (n=9) and the others in research projects (n=11).

 

To test the hypotheses the experts were confronted with a number of statements about the value added of MM, and typical risk factors for MM projects and for the introduction of MM products and services (see appendix C.3.). Likert-type scales were used for the estimations that ran from 1 (Strongly disagree) to 5 (Strongly agree). The middle value 3 is 'neutral'.

An example of a question with regard to a statement is:

 

1.1. The user friendliness of an information system improves with MM

Strongly disagree 1----------2----------3----------4----------5 Strongly agree

 

An example of a question with regard to a risk factor is:

 

2. The following risk factors are typical for multimedia projects:

2.1. Complexity

Strongly disagree 1----------2----------3----------4----------5 Strongly agree

 
 

       
5.3.3. Results

A summary of the t-test results with regard to the hypotheses H1-1 to H4-14 is given in table 12. In general, all means were in the hypothesised direction (>3), although not always at a significance level of α = .05.

With regard to the (perceived) value added of MM all the null hypotheses can be rejected. The null hypotheses H1-1₀ that the user friendliness of an IS does not improve by MM (p<.01, t=3.53, df=19), H1-2₀ that the presentation of information does not improve by usage of audio and video (p<.01, t=4.05, df=19), H1-3₀ that an MM message is not understood better than a textual message (p<.05, t=2.36, df=19), and H1-4₀ that the service to customers does not improve by usage of MSs (p<.01, t=5.57, df=19) can be rejected in favour of the respective alternative hypotheses. MM is perceived to have value added.
 
No     Statement                                                       Mean   s²     s      t      p<
H1-1   The user friendliness of an IS improves with MM                 3.9    1.35   1.17   3.53   .01
H1-2   Presentation of information improves by adding audio and video  4.0    1.09   1.05   4.05   .01
H1-3   An MM message is better understood than a textual message       3.7    1.58   1.28   2.36   .05
H1-4   The service to customers improves by using MSs                  4.1    0.76   0.86   5.57   .01
H4-5   Complexity is a risk factor                                     3.9    1.33   1.17   3.45   .01
H4-6   Scarcity of multidisciplinary expertise                         4.0    1.29   1.15   3.71   .01
H4-7   High production costs for audio-visual information              3.4    1.57   1.28   1.44   NS
H4-8   Too little standardisation of MM products                       3.7    1.16   1.09   2.96   .01
H4-9   Costs of hardware and software                                  3.5    1.14   1.09   1.83   .05
H4-10  Costs of using information services                             3.2    0.69   0.86   1.10   NS
H4-11  Costs of telecommunication                                      3.6    1.51   1.26   1.97   .05
H4-12  Too little dissemination of the use                             3.5    1.17   1.10   1.81   .05
H4-13  Unstable standards                                              3.1    1.39   1.21   0.21   NS
H4-14  Fast outdating of products and services                         3.1    1.25   1.15   0.41   NS

Table 12. Overview of reactions to statements about MM by MM project members
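The t values in table 12 come from one-sample t-tests of the mean response against the 'neutral' scale value 3. A minimal sketch of such a test is given below. The responses are hypothetical (the raw answers are in appendix C.3. and are not reproduced here), the function name `one_sample_t` is an assumption, and so is the one-tailed reading of the tests; the critical values are the standard one-tailed table values for df=19.

```python
from math import sqrt
from statistics import mean, stdev

# One-tailed critical t values for df = 19, taken from a standard t table.
CRITICAL_T_DF19 = {0.05: 1.729, 0.01: 2.539}


def one_sample_t(responses, neutral=3.0):
    """Return (t, df) for H0: mean <= neutral against H1: mean > neutral."""
    n = len(responses)
    standard_error = stdev(responses) / sqrt(n)
    return (mean(responses) - neutral) / standard_error, n - 1


# Hypothetical answers of 20 respondents to one statement (Likert scores 1..5).
responses = [5, 4, 4, 3, 5, 4, 2, 4, 5, 3, 4, 4, 5, 3, 4, 5, 4, 2, 4, 4]

t, df = one_sample_t(responses)
if t > CRITICAL_T_DF19[0.01]:
    verdict = "p<.01"
elif t > CRITICAL_T_DF19[0.05]:
    verdict = "p<.05"
else:
    verdict = "NS"
print(f"mean={mean(responses):.2f}, t={t:.2f}, df={df}, {verdict}")
```

Recomputing t from the rounded means and standard deviations in table 12 will only approximate the tabulated values, because the rounding of the mean carries through to t.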

 

The experts agreed (≈ 4) with the statements that the user friendliness of an IS improves by MM (mean = 3.9), that the presentation of information improves when using audio and video (mean = 4.0), and that service to customers improves by using MSs (mean = 4.1). They tended to agree with the statement that an MM message is better understood than a textual one (mean = 3.7), although it was remarked that this depends on the type of message and the type of audience.

 

Figure 37. Bar chart showing mean reactions to statements for the value added of MM

 

With regard to the (perceived) risk factors for MM projects three out of four null hypotheses can be rejected in favour of the alternative hypotheses. As expected, the null hypotheses H4-5₀ that complexity is not a typical risk factor for MM projects (p<.01, t=3.45, df=19), H4-6₀ that scarcity of multidisciplinary expertise is not a typical risk factor for MM projects (p<.01, t=3.71, df=19), and H4-8₀ that too little standardisation of MM products is not a typical risk factor for MM projects (p<.01, t=2.96, df=19) can be rejected in favour of the respective alternative hypotheses. Only the null hypothesis H4-7₀ that high production costs of audio-visual information are not a typical risk factor for MM projects (p=NS, t=1.44, df=19) cannot be rejected, although the outcome is in the expected direction (mean = 3.4). The high variance (1.57) for high production costs indicates that there was little consensus about the degree to which it is a risk factor.

The experts agreed (≈ 4) that complexity (mean = 3.9) and scarcity of multidisciplinary expertise (mean = 4.0) are risk factors for MM projects, and tended to agree that too little standardisation of MM products (mean = 3.7) is a risk factor as well.

 

Figure 38. Bar chart showing mean reactions for typical risk factors for MM projects

 

With regard to the (perceived) risk factors for the introduction of MM products and services three out of six null hypotheses can be rejected in favour of the alternative hypotheses. As expected, the null hypotheses H4-9₀ that the costs of hardware and software are not a risk factor for the introduction of MM products and services (p<.05, t=1.83, df=18), H4-11₀ that the costs of telecommunication are not a risk factor for the introduction of MM products and services (p<.05, t=1.97, df=18), and H4-12₀ that too little dissemination of the use is not a risk factor for the introduction of MM products and services (p<.05, t=1.81, df=18) can be rejected in favour of the respective alternative hypotheses. The null hypotheses H4-10₀ that the costs of using information services are not a risk factor for the introduction of MM products and services (p=NS, t=1.10, df=18), H4-13₀ that unstable standards are not a risk factor for the introduction of MM products and services (p=NS, t=0.21, df=17), and H4-14₀ that fast outdating of products and services is not a risk factor for the introduction of MM products and services (p=NS, t=0.41, df=18) cannot be rejected: the outcomes are in the expected direction (means of 3.2, 3.1 and 3.1 respectively), but not significantly so.

The experts tended to agree (≈ 3.5) that the costs of hardware and software (mean = 3.5), the costs of telecommunication (mean = 3.6), and too little dissemination of use (mean = 3.5) are risk factors for the introduction of MM products and services.

 

Figure 39. Bar chart with mean reactions with regard to typical risk factors for the introduction of MM products and services.

 

Reliability

The reliability coefficient of the test was estimated using the Spearman-Brown Formula. The reliability estimate of 0.73 is not very high. This implies that we should be careful when interpreting the test outcomes.

 

Split half correlation coefficient    0.57
Spearman-Brown coefficient            0.73

Table 13. Computation of test reliability estimates using
the Spearman-Brown Formula.

 

It is assumed that the average test item outcomes for research respondents (mean = 3.7) and non-research respondents (mean = 3.6) stem from the same population, i.e., that the test means are equal. A two-sample t-test shows that this assumption need not be rejected (p=.61, t=-0.52, df=14). (At item level, 12 of 14 items did not show significant differences between the two groups; see appendix C.3.)

5.3.4. Discussion

My expectations about the value added of MM were confirmed by the respondents, meaning that it is also their opinion that, in general, MM has value added (H1): MM improves the user friendliness of an IS, audio and video improve the presentation of information, an MM message is better understood than a textual one, and service to customers improves with the use of MM. These opinions are probably only valid to a certain degree: we can validate such opinions by usability testing and by effectiveness measurements.

My expectation that complexity, scarcity of multidisciplinary expertise, and too little standardisation of MM products are typical risk factors for MM projects was confirmed by the respondents (H4). Less consensus exists about the high production costs of audio-visual information: some respondents agreed and some disagreed that this is a typical risk factor for MM projects. A possible explanation is that the costs of audio-visual information production are relatively easy to assess, and that they form only a limited part of the total development costs. Moreover, in many MM projects no new audio-visual information is produced; already available audio-visual information is re-used.

For the introduction of MM products and services the respondents believe, as expected, that the costs of hardware and software, the costs of telecommunication and too little dissemination of the use are risk factors.

These latter risk factors are not only typical for MM products and services.

It is interesting to note that too little standardisation is seen as a risk factor for MM projects, but that unstable standards are seen as 'neutral' for the introduction of MM products and services. Perhaps this is because standards are always unstable and evolving (see chapter 2), and because the success of products and services depends on their other qualities (e.g., the price/performance ratio).

The validity of risk factors can be further analysed by comparing successful and unsuccessful projects, and products and service introductions.

 

5.4. Expert assessment of the viability of MM teleservices

5.4.1. Introduction

A market survey on tele-applications resulted in a clustering of MM teleservices which are believed to be viable. These clusters included a teleshopping/telemarketing cluster (Peeters & Koenen, 1993). These clustered MM teleservices may form applications in a future Virtual Market (VM). They include several of the MRSs (see chapter 3) with telecommunication extensions. For example, an extended TSA, MBC or MPS may belong to the cluster teleshopping/telemarketing. Although the teleshop/telemarketing cluster is particularly relevant to this research, the other clusters are relevant as well. An on-line accessible MCA belongs to the electronic publishing/information retrieval cluster, while several of the MRSs, e.g., the MDA or MAI, belong clearly to the cluster office/process automation.

 

The question is, based on the viability hypothesis H3: do the recognised MM clusters contain viable telecommunication applications in the short term (0-2 years), in the medium term (2-5 years) and in the long term (5-10 years)? Since our particular attention is focused on M&S, our focus must be on the judgements for teleshopping and telemarketing.

The alternative hypotheses can be formulated, in natural language, as follows:

 

    H3-1₁: The MM clusters are important for the telecommunication business.

     

      H3-2₁: MM teleshopping/telemarketing is important for the telecommunication business.

       

It can also be hypothesised that there is a strong positive relationship between the importance of an MM cluster, or MM teleshopping/telemarketing in particular, and time:

 

      H3-3₁: The importance of MM clusters for the telecommunication business increases with time.

       

      H3-4₁: The importance of MM teleshopping/telemarketing for the telecommunication business increases with time.

5.4.2. Method

In December 1993 a meeting with 18 MM experts of PTT Research, mostly project leaders, was organised as part of the PTT Research MIPS project. At the beginning and at the end of the meeting all the MM experts were asked to fill in a questionnaire (see appendix C.1.).

 

To test the hypotheses the experts were asked for their importance estimates for an MM cluster in the short term (0-2 years), in the medium term (2-5 years) and in the long term (>5 years). Interval scales were used for the estimations, running from 1 (very unimportant) to 5 (very important). The middle value 3 is 'neutral'. An example of a question is given below:

 

    2. How important is the multimedia cluster teleshopping/telemarketing for the business of PTT Telecom?
0-2 years:

Very unimportant 1----------2----------3----------4----------5 Very important

 

2-5 years:

Very unimportant 1----------2----------3----------4----------5 Very important

 

>5 years:

Very unimportant 1----------2----------3----------4----------5 Very important

 

The use of this scale results in the following general format of the null hypothesis and alternative hypothesis with regard to the hypotheses H3-1 to H3-4:

    H₀: μ ≤ 3        H₁: μ > 3

where μ is the mean importance estimate and 3 is the 'neutral' scale value.

 

It was expected that the expert estimates would be highly correlated with time in the sense that the mean estimates for the long term (>5 years) would be higher than the mean estimates for the medium term (2-5 years), and that in their turn the mean estimates for the medium term would be higher than those for the short term (0-2 years). Thus, the general format for the H3-3 and H3-4 null hypotheses and the alternative hypotheses is:

    H₀: μ(0-2 years) = μ(2-5 years) = μ(>5 years)        H₁: μ(0-2 years) < μ(2-5 years) < μ(>5 years)

5.4.3. Results

As can be seen in summary table 14, no MM cluster was seen as important for the telecommunication business in the short term (0-2 years) at a significance level of α = .05. In the medium term (2-5 years) almost all MM cluster estimates were significantly above 'neutral', with the exception of the estimates for security and electronic publishing/information retrieval. In the long term (>5 years) all clusters were seen as important, and all these results are significant or highly significant. For the cluster means, the same significance pattern is found as for the medium term.

 

Figure 40. Assessment of importance of MM clusters for a telecommunication company

 

The experts saw the teleshopping/telemarketing cluster as important for the telecommunication business, but this estimation is only significant in the medium term (p<.01, t=2.82, df=17) and the long term (p<.01, t=5.36, df=17). Thus, H3-2₀ that the cluster is not viable can be rejected for the teleshopping/telemarketing cluster in the medium and the long term.

 
 
Test 1                      0-2 years                2-5 years                >5 years                 Cluster mean
Cluster                     Mean  s     t      p<    Mean  s     t      p<    Mean  s     t      p<    Mean  s     t      p<
AVT                         3.3   1.32  0.89   NS    4.1   0.91  5.04   .01   4.6   0.70  9.80   .01   4.0   0.85  4.93   .01
Teleshopping/telemarketing  2.9   1.09  -0.44  NS    3.6   0.88  2.82   .01   4.0   0.81  5.36   .01   3.5   0.73  2.92   .01
Infotainment                3.1   1.26  0.19   NS    3.7   1.03  2.75   .01   4.3   0.96  5.66   .01   3.7   0.95  2.98   .01
Security                    3.1   1.40  0.25   NS    3.3   0.91  1.16   NS    3.4   0.85  1.94   .05   3.2   1.00  1.03   NS
Store & forward services    2.6   0.92  -1.8   NS    3.6   0.85  3.05   .01   4.2   0.81  6.41   .01   3.5   0.73  2.78   .01
Electronic publishing/IR    2.6   1.04  -1.6   NS    3.1   0.76  0.62   NS    3.5   0.99  2.15   .05   3.1   0.80  0.39   NS
Office/process automation   3.2   1.20  0.59   NS    3.7   1.19  2.38   .05   3.9   0.94  4.27   .01   3.6   2.38  2.38   .05

Table 14. Overview of importance estimates of an MM cluster for the business of a telecommunication company as assessed by MM
telecommunication experts (n=18).

 

To test the hypothesis (H3-1₀) that the experts are not positive about the MM telecommunication clusters in general, a one-sample t-test was performed on the expert means (see appendix C.1.). The result is that the H3-1₀ hypothesis can be rejected: the mean expert estimate (3.51) is significantly (p<.01, df=17, t=3.42) above 'neutral' (3).

 

The H3-3 and H3-4 alternative hypotheses that there is a positive relationship between time and the level of importance are formulated above. An ANOVA was used to test the H3-3 null hypothesis for mean item outcomes and to test the H3-4 null hypothesis for the teleshopping/telemarketing outcomes. The outcomes of both ANOVA tests are highly significant (p<.01, see table 15), meaning that the null hypotheses can be rejected. Given that no inverse relationship between 'importance' and 'time' is found, the alternative hypotheses H3-3₁ and H3-4₁ are supported. The expert expectations for the MM clusters, and for MM teleshopping/telemarketing in particular, increase significantly with time.

 

                                df    F       p<
T1 Item means                   2     16.07   .01
T1 Teleshopping/Telemarketing   2     6.78    .01

Table 15. Test 1 ANOVA outcomes for item means and
teleshopping/telemarketing in particular.
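The F ratio of a one-way ANOVA over the three terms can be sketched as follows. This is an illustration under assumptions, not the original analysis: the function name `one_way_anova` is hypothetical, and the input is the set of rounded short, medium and long term cluster means from table 14, used here as stand-in item means. The result therefore need not match the F value reported in table 15, which was presumably computed on unrounded scores.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Rounded cluster means per term, copied from table 14 (test 1).
short_term = [3.3, 2.9, 3.1, 3.1, 2.6, 2.6, 3.2]
medium_term = [4.1, 3.6, 3.7, 3.3, 3.6, 3.1, 3.7]
long_term = [4.6, 4.0, 4.3, 3.4, 4.2, 3.5, 3.9]

f_ratio, df_b, df_w = one_way_anova([short_term, medium_term, long_term])
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

With three terms the between-groups degrees of freedom are 2, as in table 15.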

 

Reliability

The test-retest reliability is a convenient 'interpretation' of test reliability (Kidder, 1981). Thus, the same subjects were confronted a second time with the same test on the same day (about 4 hours later). The results are summarised in table 16 below.

 
Test 2 means                  0-2 years   2-5 years   >5 years   Cluster mean
AVT                           3.4         4.1         4.6        4.0
Teleshopping/telemarketing    2.2         2.8         3.2        2.7
Infotainment                  2.8         3.8         4.3        3.6
Security                      2.6         2.9         3.0        2.8
MM store & forward services   2.5         3.1         3.5        3.0
Electronic publishing/IR      2.3         2.9         3.2        2.8
Office/process automation     3.1         3.6         4.0        3.6

Table 16. Results of the second test. Mean importance estimates of an MM cluster for the business of a telecommunication company as assessed by
MM telecommunication experts (n=18).

 

Figure 41. Scatter diagram showing the test-retest scores for the test items.

 

The correlation coefficient between test and retest outcomes is 0.89 (see figure 41 and the summary table below). This high correlation gives confidence in the reliability of the assessment method. There are two relevant test-retest reliability problems (see section 5.2.).

    • A carry-over or learning effect might have been present because the time between the tests was short (about 4 hours): the subjects could still remember the answers they gave the first time. The effects of this are rather unpredictable: it may lead to repeating answers, or it may lead to more differentiated answers.

    • Time effects were reduced by keeping the time interval short. Nevertheless, a time interval effect did occur as a discussion of MM matters took place in between tests!
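A test-retest coefficient of this kind is a Pearson correlation between the first and second test scores. The sketch below shows the computation. The function name `pearson_r` is an assumption, and the input is the set of rounded cluster means per term from tables 14 and 16, used as stand-in data: the coefficient reported above was computed on the test items themselves, so the outcome of this sketch may differ from it.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


# Rounded means per cluster and term (short, medium, long), seven clusters each.
test_1 = [3.3, 4.1, 4.6, 2.9, 3.6, 4.0, 3.1, 3.7, 4.3, 3.1, 3.3, 3.4,
          2.6, 3.6, 4.2, 2.6, 3.1, 3.5, 3.2, 3.7, 3.9]  # table 14
test_2 = [3.4, 4.1, 4.6, 2.2, 2.8, 3.2, 2.8, 3.8, 4.3, 2.6, 2.9, 3.0,
          2.5, 3.1, 3.5, 2.3, 2.9, 3.2, 3.1, 3.6, 4.0]  # table 16

print(f"test-retest r = {pearson_r(test_1, test_2):.2f}")
```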

     

To estimate the test reliability another approach was also used: split-half test reliability using the Spearman-Brown Formula (see section 5.2.). The results are given both for the first and the second test in the table below.

The conclusion is that the reliability of the test is satisfactorily high, about 0.9. The fact that the Spearman-Brown reliability estimates approach the test-retest reliability coefficient so closely confirms the idea that the Spearman-Brown estimate is a useful alternative when test-retest reliability estimates cannot be computed.

 

Test-retest reliability                        0.89
Split-half test correlation coefficient        0.80
Test reliability (Spearman-Brown Formula)      0.89
Split-half retest correlation coefficient      0.83
Retest reliability (Spearman-Brown Formula)    0.91

Table 17. Test reliability estimates.

 

It can thus be assumed that the experts were consistent in their estimations. To test this assumption a t-test was performed on the estimates of experts per item for the first and second test. The outcomes in table 18 indicate that there is no statistically significant difference (p=NS, df=34, t=1.46) between the mean outcomes of test 1 (3.5) and test 2 (3.2).

 

Statistically significant differences are present, however, between the mean outcomes for the first and second test with regard to cluster means (p<.05, df=6, t=2.57) and, more seriously, teleshopping/telemarketing (p<.01, df=34, t=2.94). Furthermore, the test 2 teleshopping/telemarketing outcomes even in the long term are only slightly, not significantly, above 'neutral' (3.2) (see appendix C.1.).

 
                              df    n     t       p<
Expert means                  34    18    1.46    NS
Cluster means                 6     7     2.57    .05
AVT                           34    18    -0.13   NS
Teleshopping/Telemarketing    34    18    2.94    .01
Infotainment                  34    18    0.21    NS
Security                      34    18    1.41    NS
MM Store & Forward services   34    18    1.57    NS
Electronic Publishing         34    18    1.05    NS
Office & Process Automation   34    18    0.05    NS

Table 18. Two-tailed two-sample t-test summary table.
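A two-sample t statistic with pooled variance, as commonly used for comparisons of this kind, can be sketched as below. The function name `two_sample_t` and the expert estimates are hypothetical; they only illustrate the computation. With two groups of 18 estimates the degrees of freedom are 18 + 18 - 2 = 34, as in most rows of table 18.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_t(first, second):
    """Return (t, df) for a two-sample t-test with pooled variance."""
    n1, n2 = len(first), len(second)
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * variance(first) + (n2 - 1) * variance(second)) / df
    standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean(first) - mean(second)) / standard_error, df


# Hypothetical importance estimates (1..5) of 18 experts in test and retest.
test_scores = [4, 3, 4, 5, 3, 4, 4, 2, 4, 3, 5, 4, 3, 4, 4, 3, 4, 4]
retest_scores = [3, 3, 4, 4, 2, 3, 4, 2, 3, 3, 4, 3, 3, 3, 4, 2, 3, 3]

t, df = two_sample_t(test_scores, retest_scores)
print(f"t = {t:.2f}, df = {df}")
```

A positive t means that the first test scores are higher on average than the retest scores.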

 

Figure 42. Comparison of mean test and mean retest scores, itemised for the MM clusters

 

Although the test reliability is satisfactory, about 0.9, the significantly lower teleshopping/telemarketing estimates in the retest make it difficult to interpret the results with regard to the H3-20 hypothesis: these estimates are about 'neutral', so on the retest data this hypothesis cannot be rejected.

Concurrent validity

An indication of the validity of the expert ratings can be obtained by looking at the correlation of the ratings with a concurrent criterion. In this case, market growth estimates from OVUM (Jeffcoate et al., 1993) were used as the concurrent criterion. The outcomes are:

    • a correlation coefficient of 0.99 between the expert AVT estimates and the OVUM estimates for revenues from videoconferencing traffic in Europe and the USA for telecommunication companies; OVUM estimates these revenues will grow from $ 446 million in 1992 to $ 2,223 million in 2000;

    • a correlation coefficient of 0.89 between the mean scores for the short/middle/long term (2.96, 3.57 and 4 respectively) and the OVUM estimates for revenues from MM PC traffic for telecommunication companies; OVUM estimates these revenues will grow from $ 1 million in 1992 to $ 1,295 million in 2000.

     

Figure 43. Comparison of highly correlated, but significantly different, forecasts for the videoconferencing market.
 

We should be careful in interpreting these correlation coefficients, because market forecasts have limited reliability and validity. For example, the comparable OVUM and YankeeGroup (1992) estimates for the total videoconferencing market from 1992-1996 are highly correlated (0.92), since both predict ongoing growth. They assume very different growth rates, however, so the forecasts of The YankeeGroup are significantly (p<0.05; t=2.36, df=4) more optimistic than those of OVUM (see figure 43). Which one is valid?

Nevertheless, we can conclude from the high concurrent validity coefficients that the expert judgements about the growing importance of MM for telecommunication are in agreement with market forecasts.

5.4.4. Discussion

The null hypothesis that MM clusters are not important for the telecommunication business can be rejected for some clusters and for all clusters in the long term (>5 years). The hypothesis that MM teleshopping/telemarketing is not important for the telecommunication business can also be rejected for the medium and long term. The retest results, however, throw a shadow over this latter result, probably due to the effects of the discussion about MM clusters that took place in between the test and retest.

Yet, the hypothesis that the importance of MM clusters, and MM teleshopping/telemarketing in particular, for the telecommunication business does not increase with time can be rejected in favour of the alternative hypothesis. These expectations correspond with market forecasts and widespread expectations within the telecommunications industry.

Thus, one can conclude that the MM clusters, including MM teleshopping/telemarketing, are expected to become important from a telecommunication company's point of view, but that there is uncertainty about how fast this will happen and at what level of profitability.

 

5.5. Expert evaluation of an MBC and the value added of MR

5.5.1. Introduction

As an evaluative study (Hoogeveen, 1993c) shows, the paper General Specification pages catalogue of PTT Telecom, which contains a two-page description of every product and service for the business market, has a number of quality problems: topicality (63%-90%) and completeness (58%) are both too low. An MBC would solve at least one of these problems: low topicality.

It is then necessary to ask, will an MBC offering MR facilities be effective, and what MM elements and retrieval facilities have value added?

On the basis of H5, about the (perceived) effectiveness of MRSs for M&S, it is hypothesised that:

 

H5-11: The MBC is judged to be better than a paper catalogue.

 

On the basis of H1, about the value added of MM, and H2, about the value added of retrieval, it is hypothesised that added MM elements and added retrieval facilities will be judged positively in case of an MBC.

A number of information types were selected to be judged for the MBC: video, speech, and colour pictures. A number of retrieval facilities were also selected for testing: hyperlinking, search fields, graphical browsing (i.e., backtracking with a history function), full text searches, hierarchical indexing using menu structures, and browsing through reducible sets using product names. The MM elements chosen are believed to offer value added when showing product and services information. The selected retrieval facilities are the simpler ones, which do not need much explanation for a relatively inexperienced computer user, as most sales people are.

 

The general forms of the alternative hypotheses about these elements and facilities are:

H11: Offering the MM element is judged positively by M&S experts.

 

H21: Offering the retrieval facility is judged positively by M&S experts.

 

The idea is that these judgements on the effectiveness of an MBC and value added of MM and retrieval give indications of the viability of an MBC (H3).

 

5.5.2. Method

Between August 1993 and December 1994 an MBC demonstrator based on the paper General Specification pages catalogue was developed as part of the PROMISE project within PTT Research (Derksen, 1994). The MBC demonstrator contains two main modules: a module for compiling a tailor-made catalogue, and a presentation module to access the contents of the MBC.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


Figure 44. Searching and presenting a product using the MBC demonstrator.

 

The demonstrator includes all the MM elements and retrieval facilities mentioned above. Video is offered in a small window at the start of the 'specification pages' of a product or service. Corporate TV commercials were digitised for this purpose. Speech is offered as part of the video clips, and is also offered to read aloud the text of the product or service descriptions. A colour picture is included for every product which can be blown up to screen size if the picture is clicked on. Since the presentation module contains the complete selection of MM elements and retrieval facilities most of the presentation time was dedicated to showing the presentation module.

 

An M&S expert panel, consisting of PTT Telecom M&S staff, was formed to evaluate the MBC demo and test the hypotheses. 18 people were invited for the expert panel, but only 15 were actually available for an evaluation session.

The session consisted of the presentation of the MBC demonstrator, a short period of questions and answers led by the presenter, and the filling in of a questionnaire (see appendix C.4.) by the M&S experts.

 

To test the main hypothesis about the value added of an MBC an interval scale was used that runs from 1 (much worse) to 5 (much better). The middle value 3 is 'neutral'. An example of a question is:

 

    1. Do you think that a similar [as demonstrated] multimedia catalogue is better or worse than a paper catalogue for sales support?

    Much worse 1----------2----------3----------4----------5 Much better

     

To test the hypotheses about the MM elements and retrieval facilities interval scales were used that run from 1 (very negative) to 5 (very positive). The middle value 3 is 'neutral'. An example of a question is:

 

    2. How do you judge the addition of video to the catalogue?

    Very negative 1----------2----------3----------4----------5 Very positive

     

The use of these scales results in the following general format for the null hypothesis and alternative hypothesis with regard to the hypotheses formulated in the foregoing section, where μ is the mean expert judgement and 3 is the 'neutral' scale value:

    H0: μ ≤ 3
    H1: μ > 3

 

The relevant validity threats are test leader effects and demonstration system effects (see section 5.2.).

5.5.3. Results

An MBC was judged to be better (mean = 3.9) than the paper catalogue. Thus, the null hypothesis that an MBC is not judged to be better than a paper catalogue can be rejected (p<.01, t=5.77, df=13) (see table 19).

Figure 45. Bar chart with mean expert judgements

 

All the judgements on MM elements and retrieval facilities are in the expected direction (>3). Since the test results (see table 19) are significant at the .05 level (speech, full text) or the .01 level (all the others), the null hypotheses that the addition of these elements and retrieval facilities to an MBC is not judged positively can be rejected.

 
 
H     Item                                                         Mean  Var.  SD    t     p<
5-1   MM catalogue is better than paper catalogue                  3.9   0.42  0.65  5.77  .01
1-2   Judgement on the addition of video                           3.8   0.72  0.85  3.75  .01
1-3   Judgement on the addition of speech                          3.5   1.09  1.05  2.17  .05
1-4   Judgement on the addition of colour photographs              4.5   0.48  0.69  9.41  .01
2-5   Judgement on hyperlinking support                            4.0   0.85  0.92  4.42  .01
2-6   Judgement on supporting key word searches                    3.6   0.58  0.76  3.22  .01
2-7   Judgement on the possibility of having a history function    3.9   0.44  0.66  5.49  .01
2-8   Judgement on the possibility of using full text searches     3.6   0.78  0.88  2.92  .05
2-9   Judgement on the possibility of searching by menus           3.7   0.84  0.91  3.32  .01
2-10  Judgement on the possibility of searching by browsing        3.6   0.52  0.72  3.80  .01

Table 19. Summary one-sample t-test results for the MBC demonstrator
 

The use of colour pictures especially was judged 'positive' to 'very positive' (mean = 4.5). The use of video was judged no better than 'positive' (mean = 3.8), probably because the moving pictures were projected imperfectly, by a transview on an overhead projector. Speech was judged 'neutral' to 'positive' (mean = 3.5): reading text aloud was not seen as a really valuable addition.

All search facilities were judged 'positive' (3.5 < mean < 4.5), with hyperlinking and graphical browsing in the form of a history function as positive extremes.

Reliability

The outcome of the test reliability computation for all items using the Spearman-Brown Formula is not really convincing: a reliability of about .5, due to a low correlation for retrieval items. The test reliability is more satisfactory (0.78) for the MM items alone.

 

                        Split-half correlation    Spearman-Brown
                        coefficient               Formula
All items               0.34                      0.50
MM items only           0.64                      0.78
retrieval items only    0.21                      0.34

Table 20. Reliability estimates using the Spearman-Brown Formula

5.5.4. Discussion

The positive judgements on the MBC as demonstrated by the MBC demonstrator and its elements and facilities are encouraging, but they do not mean that every paper catalogue should be replaced by an MM catalogue and that all MM elements and retrieval facilities judged positively are always needed. Remarks made by the respondents indicated that an MBC should contain additional (not demonstrated) functionality for customer profiles and customer history, print functionality, references to happy client situations to demonstrate the use of products and services, and access to related information systems.

If the positive expert judgement about the MBC in comparison to a paper catalogue is a valid and representative measure, then a positive statement about the viability of MBCs in general seems justified. Yet, a generic judgement like this does not say much about the viability of an MBC for a specific business situation.

 

5.6. Expert evaluation of a promotional CD-i and the value added of MR

5.6.1. Introduction

Hypotheses comparable to those tested with the MBC demo were tested using a demo on a CD-i. There is one main difference: the CD-i was positioned as a promotional system that can be consulted occasionally by a wide range of potential customers, whilst the MBC will be used by business sales staff and business customers.

Nevertheless it is interesting to make comparisons between the outcomes of both evaluations (see section 5.7.).

 

The main alternative hypothesis, based on hypothesis H5 about system effectiveness, is:

 

H5-11: The CD-i is seen as a useful medium for M&S by M&S experts

 

Note that 'useful' is used as interchangeable with 'effective'. Further it is assumed that a medium that proves to be effective for M&S in terms of meeting business objectives is also viable.

 

On the basis of H1, about the value added of MM, a number of MM elements (information types) were selected to be tested on the CD-i: video, music, pictures, and speech. On the basis of H2, about the value added of retrieval, a number of retrieval facilities were also selected to be tested: hyperlinking, hierarchical indexing in the form of menu structures, browsing and search fields. The MM elements are believed to be of great value for marketing product information. The retrieval facilities selected are the simpler ones, which need little explanation for a relatively inexperienced computer user to use.

 

The general forms of the alternative hypotheses about these elements and facilities are:

H11: Offering the MM element is judged positively by M&S experts.

H21: Offering the retrieval facility is judged positively by M&S experts.

 

The idea is that judgements on the effectiveness of a promotional CD-i and value added of MM and retrieval give indications of the viability of an MPS (H3).

5.6.2. Method

An MR demonstrator on CD-i was developed for PTT Telecom with the theme 'Mobile communication products' between November 1993 and January 1994. The actual development of this CD-i title was done in collaboration with the CD-i developer Merlin. The CD-i contains five modules: a mobile products catalogue, a slide show with these products, news in the form of TV commercials, an animation to explain the operation of a mobile telephone (a Greenhopper), and an explanation of how to produce a CD-i in the form of an interactive sheets presentation.

The CD-i title includes most of the MM elements and retrieval facilities mentioned in the foregoing section. Browsing, however, was only implemented to a limited extent, and search fields were not implemented at all: realising them on CD-i requires a lot of programming effort, which was beyond the available budget.

 

To evaluate the CD-i demo and to test the hypotheses an M&S expert panel was formed, consisting of PTT Telecom M&S staff. 18 people were selected for the expert panel, but only 15 were actually available for an evaluation session.

The session consisted of a presentation of the CD-i demonstrator, a question and answer session controlled by the presenter, and the filling in of a questionnaire (see appendix C.5.) by the M&S experts. The CD-i session followed on after the MBC demonstration session.

 

 

 

 

 

 

 

 

 


Figure 46. Searching and presenting a product using the CD-i demonstrator.

 

For the main usefulness hypothesis (H5-1) a nominal (yes/no) scale was used. The related question in the questionnaire was formulated as follows:

 

    1. Do you think that the CD-i is a useful medium for marketing & sales? yes/no

     

Interval scales are used for the hypotheses on the MM elements and retrieval facilities. The scales used run from 1 (very negative) to 5 (very positive). The middle value 3 is 'neutral'. An example of a question is:

 

    1. How do you judge the addition of video to product presentations for customers?

    Very negative 1----------2----------3----------4----------5 Very positive

     

The use of this scale results in the following general format of the hypotheses with regard to the MM elements and retrieval facilities, where μ is the mean expert judgement and 3 is the 'neutral' scale value:

    H0: μ ≤ 3
    H1: μ > 3

5.6.3. Results

Of the 15 M&S experts, 14 judged the CD-i to be a useful medium for M&S; one expert did not respond. This is a highly significant result (p<.01, χ²=14.071, df=1), so the null hypothesis that the MR possibilities of the CD-i are not seen as useful for M&S can be rejected.

 

The interesting M&S applications mentioned by the respondents are listed below, with the number of responses for each category given in brackets:

    • POI/POS in shops (9), for example Primafoons, Business Centres and dealer shops, to support sales conversations;

    • instruction for sales people, dealers and customers (8);

    • demonstrations and presentations at fairs and seminars (6);

    • support of corporate image and PR (3);

    • catalogues and product information in general (2);

    • information transfer towards the consumer at home (1);

    • order intake, when using a Tele-CD-i (1);

    • distribution of data and visuals (1).

     

Figure 47. Mean experts judgements on CD-i MM and retrieval aspects

 

A graphic representation of the expert judgements is shown in figure 47. Since search fields and browsing were not shown on the CD-i demo the scale for these two items runs from not necessary (1) to necessary (5). For the other items, the scale runs from very negative (1) to very positive (5).

 

As can be seen from t-test summary table 21, the outcome is not significant only for browsing. As a result the null hypothesis that browsing is not seen as a necessary retrieval facility cannot be rejected. The judgement about browsing was about 'neutral' (3.3).

 

Table 21 demonstrates that all the MM elements are judged significantly above 'neutral', meaning that the respective null hypotheses that these elements are not judged positively (less than or equal to 3) can be rejected. Both the use of speech and the use of video are judged 'positive' to 'very positive' (mean = 4.4 and mean = 4.5 respectively). The experts report that the combined use of video and speech holds attention, and makes the presentation persuasive ("seeing is believing").

 
 
H     Item                                                    Mean  Var.  SD    t       p<
1-1   Judgement on the addition of video                      4.5   0.25  0.50  12.728  .01
1-2   Judgement on the addition of speech                     4.4   0.52  0.72  8.067   .01
1-3   Judgement on the addition of music                      3.7   0.64  0.80  3.725   .01
1-4   Judgement on the addition of colour photographs         3.9   0.55  0.74  4.947   .01
2-5   Judgement on hyperlinking support                       4.0   0.17  0.41  9.873   .01
2-6   Judgement on the possibility for searching by menus     3.7   0.95  0.98  2.898   .01
2-7   Judgement on the necessity for searching by browsing    3.3   1.10  1.05  1.351   NS
2-8   Judgement on the necessity for key word searching       3.6   0.97  0.99  2.583   .05

Table 21. Summary table with t-test outcomes.

 

Colour photographs and music were judged 'positive' (mean = 3.9 and mean = 3.7 respectively). A remark made by respondents about the photographs was that they do not look very dynamic in comparison to video and speech on CD-i. With regard to music several experts noted that the choice of music is very complicated because the tastes of people differ so much. Music that is too obtrusive should be avoided.

 

The retrieval facility hyperlinking was judged 'positive' (mean = 4.0); this outcome is highly significant (p<.01, t=9.873, df=14). The use of menu structures was also judged 'positive' (mean = 3.7); this outcome was also highly significant (p<.01, t=2.898, df=14). So, the null hypotheses that hyperlinking and menu structures are not judged 'positive' (values less than or equal to 3) can be rejected. A problem reported with hierarchical menu structures for retrieval of information is that it may take a long time to find the right information.

 

As hypothesised, the experts saw the use of search fields as necessary (mean = 3.6), and this outcome is significant (p<.05, t=2.583, df=14).

 

Reliability

What is the reliability of this expert assessment? In the previous reliability measurement case the use of the Spearman-Brown coefficient based on the split halves coefficient proved useful (see section 5.2.). The Spearman-Brown coefficient is 0.70. This is a reasonable but not high reliability coefficient. It can be explained by differences in the variance (see the variance column in table 21): the variance of the answers for retrieval items is much larger than for MM items. As can be seen in table 22, test reliability for MM items only is very high (0.98), whilst test reliability for retrieval items only is not really convincing (0.51). In other words: there was a stronger consensus about the value added of the MM items than there was about the value added of retrieval items!

 
                        Split-half correlation    Spearman-Brown
                        coefficient               coefficient
All items               0.54                      0.70
MM items only           0.96                      0.98
retrieval items only    0.34                      0.51

Table 22. Reliability estimates using the Spearman-Brown Formula.

 

Generalisability (external validity)

In section 5.8. the results of demonstrating some of the same MM items in combination with the same CD-i demonstrator to respondents outside KPN, but within the Netherlands, are presented. Most of these respondents have an M&S related function. The corresponding items are related to judgements on the inclusion of video, speech, music and colour photographs in the system. When the results on these four items are compared, a correlation coefficient of 0.36 is found. When the colour photograph item is omitted, however, a perfect correlation of 1.0 is found! With regard to the use of colour photographs the two groups of respondents judged significantly differently (p<.05, t=-2.20, df=32). This is probably due to the fact that the CD-i was presented to the non-KPN respondents to show a teleshop user interface and not to show other M&S possibilities.

5.6.4. Discussion

The null hypothesis that the promotional CD-i, on the basis of the given M&S demonstration, is not seen as useful for M&S by M&S experts is rejected. What does this say about the effectiveness of promotional CD-i's? If the assumption is true that the experts give a valid and reliable judgement, and effectiveness can be equated with 'useful', then a positive answer can be given. In reality, however, there is a long way to go for the CD-i medium. Its penetration is not high enough to justify the already large investments in CD-i marketing applications for consumers. The Tele-CD-i is not yet on the market.

 

The experts' judgements on the value added of MM elements (H1) and retrieval facilities (H2) were as hypothesised, although the browsing facility was not judged significantly above 'neutral'. It seems reasonable to assume that the facilities judged positively, especially video and speech, contribute to the positive judgement about the CD-i as a whole as an M&S medium for use by customers. Since the variance for the retrieval items is large and test reliability for these retrieval items is only moderate, the test results with regard to the retrieval items should be handled with some caution. The only exception is hyperlinking.

One possibility to improve the reliability of the test part related to the retrieval items is to increase the sample; this would probably result in more stable sample means for the retrieval items.

If the variance in the answers related to three out of four retrieval items was caused by differences in the level of experience with retrieval facilities, another possibility would be to include all four retrieval facilities, including browsing lists and search fields, in the CD-i demo. The respondents could then be confronted more directly with these facilities by letting them play with the CD-i demo themselves, and by performing a usability test with potential users in a usability lab.

 

5.7. Comparing judgements with regard to the promotional CD-i and the MBC

It is interesting to compare judgements with regard to the promotional CD-i and the MBC, presented in the foregoing sections. Such a comparison is of interest because the same group of experts was involved in both demonstration sessions. A validity threat we should be aware of is an order effect: the MBC session was held first, followed by the CD-i session. An alternative that would rule out this effect is a true experimental design such as a randomised two-group design; such designs, however, are often not desirable or feasible in business settings.

 
                 Mean CD-i   Mean MBC   df   pooled var.   t       p<
Video            4.50        3.77       28   0.46          2.95    .01
Speech           4.37        3.57       28   0.77          2.49    .05
Colour photo     3.87        4.43       28   0.58          -2.04   NS
Hyperlinking     3.96        4.00       26   0.49          -0.14   NS
Menu             3.67        3.70       28   0.87          -0.10   NS
Browsing         3.33        3.67       28   0.79          -1.03   NS
Search fields    3.60        3.64       27   0.79          -0.13   NS

Table 23. Overview t-test results with regard to judgement differences
for the promotional CD-i and the MBC.
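
The t values in Table 23 can be approximated from the summary statistics alone. The sketch below assumes the usual pooled-variance form of the two-sample t statistic and 15 respondents per session; the group size is an assumption inferred from df = 28, and the function name `pooled_t` is illustrative.

```python
import math


def pooled_t(mean_a: float, mean_b: float, pooled_var: float,
             n_a: int, n_b: int) -> float:
    """Two-sample t statistic from two group means and a pooled variance."""
    standard_error = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / standard_error


# Video row of Table 23: means 4.50 and 3.77, pooled variance 0.46.
# n = 15 per session is an assumption consistent with df = 15 + 15 - 2 = 28.
t_video = pooled_t(4.50, 3.77, 0.46, 15, 15)
print(round(t_video, 2))
```

For the video row this gives 2.95, the value in the table. Other rows agree to within rounding of the published means and variances.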
 

It is interesting to note that there is no correlation (.01) between the mean scores in the two test conditions. A two-tailed t-test of the hypothesis that the mean scores of both groups on each variable are equal revealed statistically significant differences for only two MM elements: the inclusion of video and the inclusion of speech. For the inclusion of colour photographs a clear, but not significant, difference was found.

    • Video was more appreciated in a promotional CD-i than in an MBC (t=2.95, df=28, p<.01); this result is highly significant.

    • Speech was significantly more appreciated in a promotional CD-i than in an MBC (t=2.49, df=28, p<.05).

    • Colour photographs were more appreciated in an MBC than in a promotional CD-i, but this difference is not statistically significant.

In the discussion with the expert panel members it became evident that for promotional purposes on a TV based medium (the CD-i) the use of video and speech is more appropriate than on a PC based medium used in an office environment for the support of sales people (the MBC). Photographs are more fit for the M&S office environment because they do not distract so easily. On a promotional medium, however, they are too static in comparison with video.

It is interesting to note that no clear differences between the two test conditions were found with regard to the inclusion of retrieval facilities.
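
The absence of correlation between the mean scores in the two test conditions can be checked against the means listed in Table 23. The following is a minimal sketch using the Pearson product-moment correlation; the helper `pearson_r` and the variable names are illustrative, and the two lists simply copy the two columns of means in row order.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Mean scores from Table 23, in row order (video ... search fields).
first_column = [4.50, 4.37, 3.87, 3.96, 3.67, 3.33, 3.60]
second_column = [3.77, 3.57, 4.43, 4.00, 3.70, 3.67, 3.64]

r = pearson_r(first_column, second_column)
print(f"r = {r:.2f}")
```

The coefficient comes out very close to zero, in line with the reported lack of correlation; the exact second decimal depends on the rounding of the published means.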

 

5.8. A survey of the effectiveness of MM teleshop services

5.8.1. Introduction

The VM service is discussed in chapter 3. The viability of the VM depends largely on the viability of the individual MM teleshop services of information providers, who make use of the general VM service. A number of interviews with information providers offering Videotex information services, and potential investors in information services, were conducted to obtain an indication of the viability of MM teleshop services for information providers.

 

Based on hypothesis H1, about the value added of MM for M&S in situations where effective information and knowledge transfer is required, it is hypothesised that an MM user interface is judged to be better than a text based interface (H1-11), that the addition of MM elements is judged 'positive' (H1-21), and that MM is judged to be useful for marketing and sales by an information provider (H1-31).

With regard to hypothesis H3, about the viability of MRSs for M&S, it is hypothesised that an MM teleshop is judged to be viable (H3-41), and that this judgement will improve if a longer time horizon is taken (H3-51).

Certain success/risk factors are believed to be relevant (H4). Since the questionnaire is limited, only a small selection of success/risk factors was tested (H4-61):

    • the importance of the innovative image of an MM teleshop service;

    • the importance of the inclusion of automatic payment functionality, related to flexible user support, which is considered to be vital for teleshop services;

    • competitive advantage as an argument to develop a teleshop service.

With regard to hypothesis H5, about the (perceived) effectiveness of an MRS for M&S, it is hypothesised that an MM teleshop service is perceived to be effective in terms of meeting the information provider's business objectives (H5-71). Further, it is hypothesised that this judgement will be improved if a longer time horizon is taken (H5-81).

 

Finally, it is hypothesised that differences will be found between the judgements of experts from experienced, innovative firms that already run a teleshop service, and firms that do not have any experience with setting up and running teleshop services (see demographic hypothesis Hd-91).

 
 
No - Alternative hypothesis
1-1 An MM user interface is judged to be better than a text based interface (e.g., as used in teletext, Videotex or Minitel) by information providers.
1-2 The addition of an MM element (video, speech, colour pictures, music, animations) is judged positively.
1-3 MM is judged to be useful for marketing and sales by an information provider.
3-4 An MM teleshop is judged to be viable by an information provider.
3-5 The judged longer term viability of an MM teleshop service is higher than the judged shorter term viability of such a service.
4-6 A success/risk factor (innovative image, inclusion of automatic payment, competitive advantage) is relevant for an MM teleshop service.
5-7 An MM teleshop service is perceived to be effective in terms of meeting the information provider's business objectives, i.e., extra revenues, new customers, improved margin, ROI, gaining market share, improving quality of service.
5-8 An MM teleshop service is perceived to be more effective in terms of meeting the information provider's business objectives on the long term rather than on the short term.
d-9 Judgements made by experienced and non-experienced firms differ.
 

Table 24. Overview of alternative hypotheses with regard to the economic viability of MM teleshop services for information providers.

5.8.2. Method

A group of eleven respondents from experienced teleshop service companies was approached. Two companies did not wish to co-operate. Next, another group of ten respondents from comparable companies was approached. These companies were comparable in the sense that they operate in the same mix of branches as the experienced group of companies.

The group of respondents (n=19), approached between March and June 1994, consisted of 6 general managers, 6 marketing managers, 3 project leaders, 2 service development staff members, 1 head of information systems, and 1 general management assistant.

 

During an interview session the respondents were first confronted with a demonstrator giving an impression of what an MM user interface for an MM teleshop system looks like. The CD-i demonstrator, described previously, was used for this purpose. After this demonstration respondents were asked questions about the MM aspects to test hypotheses H1-x0. Next, respondents were confronted with scenario 1 and questions about viability (H3-x0) and meeting business objectives (H5-x0), and so on for scenario 2 and scenario 3. Finally, they were asked some questions about success/risk factors (H4-60).

Scenario 1 contains a description of the situation in the year 1994. Scenario 2 contains a description of the situation in the year 1999, and scenario 3 a description of the situation in the year 2004 (see appendix C.6.). Scenarios 2 and 3 are based on extrapolations of current developments, which is always hazardous.

 

An example of an interval scale used for a question about the value added of MM is given below:

 

    M1. Do you think that this kind of user interface is worse or better than a text based interface like Teletext, Videotex or Minitel?
Much worse 1----------2----------3----------4----------5 Much better

 

An example of a nominal-dichotomous question about the viability of MM teleshop services is:

 

    9. Do you think a multimedia teleshop service is viable for your company today?
0 Yes 0 No

 

An example of a question about system effectiveness (meeting business objectives) for scenario '1999' using again an interval scale is:

 

    5. Do you expect to meet the ROI requirement for a multimedia teleshop service for your company in 1999?
Certainly not 1----------2----------3----------4----------5 Certainly yes

 

The use of the interval scale results in the following general format of the H1-1, H1-2, H4-6 and H5-7 null hypotheses and alternative hypotheses:

 

It was expected that the expert estimates would be highly correlated with time in the sense that the mean estimates for the long term (2004) would be higher than the mean estimates for the medium term (1999), and that in their turn the mean estimates for the medium term would be higher than those for the short term (1994). So, the general format for the H5-8 null hypotheses and the alternative hypotheses is:

 

 

The hypotheses H1-3 and H3-4, which use a nominal-dichotomous scale (yes/no), can be formulated as hypotheses testing goodness of fit (Kirk, 1978):

Where p is the proportion of the respondents scoring yes or no.
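As an illustration, assuming that the null hypothesis states an even split between 'yes' and 'no' answers, such goodness-of-fit hypotheses and the usual test statistic could be sketched as follows. The symbols O_i and E_i (observed and expected counts per answer category) are introduced here and are not taken from the original text:

```latex
H_0:\ p_{\text{yes}} = p_{\text{no}} = 0.5
\qquad\qquad
H_1:\ p_{\text{yes}} \neq p_{\text{no}}

\chi^2 = \sum_{i \in \{\text{yes},\,\text{no}\}} \frac{(O_i - E_i)^2}{E_i},
\qquad df = 1
```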

 

Hypothesis H3-5 about the positive relationship of viability expectations (measured on a yes/no scale) with time can be formulated as hypotheses about the equality of proportions (Kirk, 1978):

 

The demographic hypothesis Hd-9 with regard to different scores on interval scales can be formulated as:

 

5.8.3. Results

With regard to the value added of MM it can be concluded from the results, presented in table 25, that the H1-10 and H1-20 hypotheses can be rejected: the addition of MM elements to the user interface of a teleshop service was judged 'positive' on all variables (mean of the mean scores by respondents: 4.2), and an MM user interface was judged to be better than a text based interface. The mean MM judgements per respondent are also significantly above 'neutral' (t=9.67, p<.01, s=0.55, df=18).

 
 
H    Evaluation MM aspects                                           df  mean  s     t      p<
1-1  M.1. MM user interface better than text based user interface    18  4.58  0.69  9.939  .01
1-2  M.2. Judging the addition of video negative/positive            18  4.21  0.98  5.404  .01
1-2  M.3. Judging the addition of speech negative/positive           18  4.16  0.90  5.618  .01
1-2  M.4. Judging the addition of colour pictures negative/positive  18  4.47  0.84  7.636  .01
1-2  M.5. Judging the addition of music negative/positive            18  3.68  0.95  3.153  .01
1-2  M.6. Judging the addition of animations negative/positive       18  4.24  0.75  7.167  .01
1    Mean score per respondent                                       18  4.22  0.55  9.673  .01

Table 25. Results of survey with regard to MM aspects measured on an interval scale.
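The t values in table 25 can be approximated from the reported means and standard deviations. The following is a minimal sketch, assuming a one-sample t-test against the scale midpoint 3 ('neutral') with n = 19 respondents (df + 1); the function name and the one-tailed critical value taken from a standard t table are illustrative choices, not part of the original analysis.

```python
import math


def one_sample_t(mean, sd, n, mu0=3.0):
    """Return the t statistic for H0: population mean equals mu0,
    computed from summary statistics (sample mean, sample sd, sample size)."""
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error


# Assumption: n = 19 respondents, because table 25 reports df = 18.
N_RESPONDENTS = 19
# Assumption: one-tailed critical t for alpha = .01 and df = 18 (standard t table).
T_CRITICAL_01 = 2.552

# Row "Mean score per respondent" of table 25: mean 4.22, s 0.55.
t_mean_score = one_sample_t(mean=4.22, sd=0.55, n=N_RESPONDENTS)

print(round(t_mean_score, 2))
print(t_mean_score > T_CRITICAL_01)
```

Because the table entries are rounded to two decimals, the recomputed values for the individual items differ slightly from the reported ones (for example, about 9.98 instead of 9.939 for M.1).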
 
Figure 48. Judgement on the value added of MM for teleshop services.

 
 
H    Evaluation MM aspects      df  % yes  yes  no  χ²     p<
1-3  M.7. MM is useful for M&S  1   100%   19   0   19.05  .01

Table 26. Results of survey with regard to MM aspects measured on a yes/no scale.
 

All respondents saw MM as useful for M&S (p<.01, χ²=19.05, df=1). This indication points in the same direction as the results given above.
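The goodness-of-fit computation behind this result can be sketched as follows, assuming expected counts of an even yes/no split among the 19 respondents. The function names are illustrative. Under this assumption the statistic comes out at 19.0, slightly below the reported 19.05; the source of that small difference cannot be determined from the text.

```python
import math


def chi_square_gof(observed, expected):
    """Pearson chi-square goodness-of-fit statistic."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_p_df1(statistic):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom.
    With df = 1 the statistic is the square of a standard normal variable,
    so the tail probability equals erfc(sqrt(statistic / 2))."""
    return math.erfc(math.sqrt(statistic / 2.0))


observed = [19, 0]        # yes / no counts for question M.7
expected = [9.5, 9.5]     # assumption: even split under the null hypothesis

chi2 = chi_square_gof(observed, expected)
p_value = chi_square_p_df1(chi2)

print(chi2)
print(p_value < 0.01)
```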

 

With regard to scenario '1994' the null hypothesis H5-70 cannot be rejected. Respondents did not believe that an MM teleshop service introduced today would generate new revenues, improve the margin, help to reach new customers, meet ROI requirements or help to gain market share. Only customer service was believed to improve if an MM teleshop service is introduced today (p<.01, t=3.47, s=1.16, df=18). If we consider the mean scores by respondents, and thus take all scores into account, it cannot be concluded that introducing an MM teleshop service in 1994 is judged to have value added.

 
 
H
Scenario I: 1994
<