Regression Analysis & ANOVA
Introduction to Regression Analysis and Analysis of Variance
洪弘祈, Associate Professor, Department of Industrial Engineering and Management, Chaoyang University of Technology (朝陽科技大學)

Outline
- Least-squares estimators of the model parameters
- Error estimation
- Hypothesis testing in regression analysis
- Confidence intervals in regression analysis
- Prediction
- Model adequacy checking

Definitions
Regress: the act of reasoning backward.
Regression: a functional relationship between two or more correlated variables that is often empirically determined from data and is used especially to predict values of one variable when given values of the others.

XY scatter plot

Terminology: simple regression
- linear model (equation)
- probabilistic linear model
- simple linear regression model
- regression coefficients

Terminology: multiple regression
- multiple regression model
- multiple linear regression model
- intercept
- partial regression coefficients
- contour plot

The dependent variable or response y may be related to k independent or regressor variables, and those variables may interact. Any regression model that is linear in the parameters (the β's) is a linear regression model, regardless of the shape of the surface that it generates.

Least-squares estimators of the model parameters: simple linear regression
- method of least squares
- least squares normal equations
- fitted or estimated regression line
- residual

Example 1: fitted regression line
Example 1: Excel report
Example 1: goodness-of-fit test

Multiple regression
Matrix Approach (I)
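The method of least squares for the simple linear regression model can be sketched in a few lines. The example below is only an illustration under stated assumptions: the data are hypothetical (the Example 1 data did not survive in the slides), and the function name `fit_line` is invented here. It uses the standard closed-form estimates b = Sxy / Sxx and a = ȳ − b·x̄, then computes the residuals about the fitted line.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b, residuals)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Corrected sums of squares and cross-products
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # intercept
    # Residual = observed value minus fitted value
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    return a, b, residuals


# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b, residuals = fit_line(xs, ys)
print(f"fitted line: y = {a:.2f} + {b:.2f} x")
```

Because the model contains an intercept, the residuals of a least-squares fit sum to zero (up to rounding), which makes a quick sanity check on any implementation.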
Note: k = p − 1, where k is the number of regressor variables and p is the number of parameters (including the intercept).

Adequacy checking of the regression model
- normal probability plot of residuals
- standardized residuals
- outliers
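The matrix approach and the residual checks can be illustrated together. The sketch below assumes the usual normal equations (X'X)β̂ = X'y and the common definition of a standardized residual, d_i = e_i / √MSE with MSE = SSE / (n − p). The slide equations were not preserved, so these forms are assumptions taken from standard textbooks. The data and the helper names `solve`, `fit`, and `standardized_residuals` are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


def standardized_residuals(X, y, beta):
    """Residuals divided by sqrt(MSE), where MSE = SSE / (n - p)."""
    n, p = len(X), len(beta)
    res = [yi - sum(bj * xj for bj, xj in zip(beta, row)) for row, yi in zip(X, y)]
    mse = sum(e * e for e in res) / (n - p)
    return [e / mse ** 0.5 for e in res]


# Hypothetical data: each row of X is [1, x1, x2], so p = 3 and k = p - 1 = 2
X = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 2, 1]]
y = [1.1, 2.9, 4.0, 6.2, 7.8]
beta = fit(X, y)
d = standardized_residuals(X, y, beta)
# Rule of thumb (an assumption, not from the slides): |d_i| > 3 deserves a closer look
suspects = [i for i, di in enumerate(d) if abs(di) > 3]
```

Solving the normal equations directly is fine for a small teaching example. For ill-conditioned problems, numerical libraries usually prefer a QR or SVD based solver.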
Key Concepts and Formulas

I. A Linear Probabilistic Model
1. When the data exhibit a linear relationship, the appropriate model is $y = \alpha + \beta x + \varepsilon$.
2. The random error $\varepsilon$ has a normal distribution with mean 0 and variance $\sigma^2$.

II. Method of Least Squares
1. Estimates a and b, for $\alpha$ and $\beta$, are chosen to minimize SSE, the sum of the squared deviations about the regression line $\hat{y} = a + bx$.
2. The least squares estimates are $b = S_{xy}/S_{xx}$ and $a = \bar{y} - b\bar{x}$.

III. Analysis of Variance
1. Total SS = SSR + SSE, where Total SS = $S_{yy}$ and SSR = $(S_{xy})^2/S_{xx}$.
2. The best estimate of $\sigma^2$ is MSE = SSE / (n − 2).

IV. Testing, Estimation, and Prediction
1. A test for the significance of the linear regression, $H_0: \beta = 0$, can be implemented using one of two test statistics: $t = b/\sqrt{\mathrm{MSE}/S_{xx}}$ or $F = \mathrm{MSR}/\mathrm{MSE}$.
2. The strength of the relationship between x and y can be measured using $r^2 = \mathrm{SSR}/\mathrm{Total\ SS}$, which gets closer to 1 as the relationship gets stronger.
3. Use residual plots to check for nonnormality, inequality of variances, and an incorrectly fit model.
4. Confidence intervals can be constructed to estimate the intercept $\alpha$ and slope $\beta$ of the regression line and to estimate the average value of y, E(y), for a given value of x.
5. Prediction intervals can be constructed to predict a particular observation, y, for a given value of x. For a given x, prediction intervals are always wider than confidence intervals.

V. Correlation Analysis
1. Use the correlation coefficient to measure the relationship between x and y when both variables are random: $r = S_{xy}/\sqrt{S_{xx}S_{yy}}$.
2. The sign of r indicates the direction of the relationship; r near 0 indicates no linear relationship, and r near 1 or −1 indicates a strong linear relationship.
3. A test of the significance of the correlation coefficient is identical to the test of the slope $\beta$.

Cause and Effect
- X could cause Y
- Y could cause X
- X and Y could cause each other
- X and Y could be caused by a third variable Z
- X and Y could be related by chance
- Bad (or good) luck
The study needs careful examination. Try to find previous evidence or academic explanations.

Multiple Linear Regression
Multiple regression model: a regression model that contains more than one regressor variable.
Multiple linear regression model: a multiple regression model that is a linear function of the unknown parameters $\beta_0, \beta_1, \beta_2$, and so on.
Examples of linear and nonlinear models: [equations missing from the source]

Intercept: $\beta_0$
Partial regression coefficients: $\beta_1, \beta_2$

Interaction: $\beta_{12}$ can be viewed and analyzed as a new parameter $\beta_3$ (replace $x_1 x_2$ by a new variable $x_3$).

Second-order term: $\beta_{11}$ can be viewed and analyzed as a new parameter $\beta_3$ (replace $x_1^2$ by a new variable $x_3$).
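The Key Concepts formulas for simple linear regression (parts II to V) can be checked numerically. The sketch below uses hypothetical data and an invented function name, `regression_anova`. It computes the ANOVA decomposition, both test statistics for H0: β = 0, and the correlation measures, so the identities t² = F and (r)² = r² can be observed directly.

```python
import math


def regression_anova(xs, ys):
    """ANOVA quantities and test statistics for y = a + b*x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)      # Total SS
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    ssr = s_xy ** 2 / s_xx
    sse = s_yy - ssr                               # Total SS = SSR + SSE
    mse = sse / (n - 2)                            # estimate of sigma^2
    return {
        "b": b,
        "TotalSS": s_yy,
        "SSR": ssr,
        "SSE": sse,
        "MSE": mse,
        "t": b / math.sqrt(mse / s_xx),            # t test of the slope
        "F": ssr / mse,                            # MSR = SSR / 1
        "r2": ssr / s_yy,                          # coefficient of determination
        "r": s_xy / math.sqrt(s_xx * s_yy),        # correlation coefficient
    }


# Hypothetical data, for illustration only
out = regression_anova([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(out)
```

The t and F statistics are compared against tabulated critical values with n − 2 degrees of freedom (and 1 and n − 2 for F). The standard library has no t or F quantile function, so that lookup is left to a table or a statistics package.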
The Analysis Procedure
When you perform multiple regression analysis, use a step-by-step approach:
1. Obtain the fitted prediction model.
2. Use the analysis of variance F test and $R^2$ to determine how well the model fits the data.
3. Check the t tests for the partial regression coefficients to see which ones are contributing significant information in the presence of the others.
4. If you choose to compare several different models, use $R^2$(adj) to compare their effectiveness.
5. Use computer-generated residual plots to check for violation of the regression assumptions.

A Polynomial Regression Model
The quadratic model is an example of a second-order model because it involves a term whose exponents sum to 2 (in this case, $x^2$). It is also an example of a polynomial model, a model that takes the form $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k + \varepsilon$.

Using Quantitative and Qualitative Predictor Variables in a Regression Model
- The response variable y must be quantitative.
- Each independent predictor variable can be either a quantitative variable or a qualitative variable, whose levels represent qualities or characteristics and can only be categorized.
- We can allow a combination of different variables to be in the model, and we can allow the variables to interact.
A quantitative variable x can be entered as a linear term, x, or to some higher power such as $x^2$ or $x^3$. You could use the first-order model: $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$.

We can add an interaction term and create a second-order model: $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$.
Qualitative predictor variables are entered into a regression model through dummy or indicator variables. If each employee included in a study belongs to one of three ethnic groups (say, A, B, or C), you can enter the qualitative variable "ethnicity" into your model using two dummy variables: $x_1 = 1$ if group B and 0 otherwise; $x_2 = 1$ if group C and 0 otherwise.

The model allows a different average response for each group. $\beta_1$ measures the difference in the average responses between groups B and A, while $\beta_2$ measures the difference between groups C and A. When a qualitative variable involves k categories, (k − 1) dummy variables should be added to the regression model.

Residual Plots and Transformations
If the range of the residuals increases as $\hat{y}$ increases and you know that the data are measurements of Poisson variables, you can stabilize the variance of the response by running the regression analysis on $y^* = \sqrt{y}$. If the percentages are calculated from binomial data, you can use the arcsin transformation, $y^* = \sin^{-1}\sqrt{y}$.
If E(y) and a single independent variable x are linearly related, and you fit a straight line to the data, then the observed y values should vary in a random manner about $\hat{y}$, and a plot of the residuals against x will show no pattern. If you had incorrectly used a linear model to fit the data, the residual plot would show that the unexplained variation exhibits a curved pattern, which suggests that there is a quadratic effect that has not been included in the model.

Stepwise Regression Analysis
Try to list all the variables that might affect a college freshman's GPA:
- Grades in high school courses, high school GPA, SAT score, ACT score
- Major, number of units carried, number of courses taken
- Work schedule, marital status, commute or live on campus
A stepwise regression analysis fits a variety of models to the data, adding variables that are significant in the presence of the other variables and deleting variables that are not. Once the program has performed a sufficient number of iterations, so that no remaining variable is significant when added to the model and no variable in the model is nonsignificant, the procedure stops. These programs always fit first-order models and are not helpful in detecting curvature or interaction in the data.

Selection of Variables in Multiple Regression
- All possible regressions: $R_p^2$ or adjusted $R_p^2$, MSE(p), $C_p$
- Stepwise regression: start with the variable that has the highest correlation with Y
- Forward selection
- Backward selection
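The criteria listed for all-possible-regressions can be computed from each candidate model's error sum of squares. The sketch below is illustrative only. The SSE values, the sample size, and the function name `selection_criteria` are hypothetical. It assumes the usual definitions: adjusted R² = 1 − MSE(p) / (Total SS / (n − 1)) and Mallows' Cp = SSE_p / MSE_full − (n − 2p), where p counts the parameters including the intercept.

```python
def selection_criteria(sse_p, p, n, total_ss, mse_full):
    """Criteria for a candidate model with p parameters (intercept included)."""
    r2 = 1 - sse_p / total_ss
    mse_p = sse_p / (n - p)
    adj_r2 = 1 - mse_p / (total_ss / (n - 1))
    cp = sse_p / mse_full - (n - 2 * p)
    return {"R2": r2, "adjR2": adj_r2, "MSE": mse_p, "Cp": cp}


# Hypothetical summary numbers: n = 20 observations, Total SS = 100
n, total_ss = 20, 100.0
candidates = {            # model name -> (SSE, number of parameters p)
    "x1": (40.0, 2),
    "x1,x2": (16.0, 3),
    "x1,x2,x3": (15.5, 4),
}
mse_full = 15.5 / (n - 4)  # MSE of the largest (full) model

table = {
    name: selection_criteria(sse, p, n, total_ss, mse_full)
    for name, (sse, p) in candidates.items()
}
best = max(table, key=lambda name: table[name]["adjR2"])
for name, row in table.items():
    print(name, {key: round(value, 3) for key, value in row.items()})
print("highest adjusted R2:", best)
```

In this made-up table, adding x3 raises R² slightly but lowers adjusted R², which is the reason the procedure above recommends R²(adj) rather than R² when comparing models of different sizes.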
Misinterpreting a Regression Analysis
A second-order model in the variables might provide a very good fit to the data when a first-order model appears to be completely useless in describing the response variable y.
Causality: Be careful not to deduce a causal relationship between a response y and a variable x.
Multicollinearity: Neither the size of a regression coefficient nor its t-value indicates the importance of the variable as a contributor of information. This may be because two or more of the predictor variables are highly correlated with one another; this is called multicollinearity.

Multicollinearity can have these effects on the analysis:
- The estimated regression coefficients will have large standard errors, causing imprecision in confidence and prediction intervals.
- Adding or deleting a predictor variable may cause significant changes in the values of the other regression coefficients.
How can you tell whether a regression analysis exhibits multicollinearity?
- The value of $R^2$ is large, indicating a good fit, but the individual t-tests are nonsignificant.
- The signs of the regression coefficients are contrary to what you would intuitively expect the contributions of those variables to be.
- A matrix of correlations, generated by the computer, shows you which predictor variables are highly correlated with each other and with the response y.

The last three columns of the matrix show significant correlations between all but one pair of predictor variables: [correlation matrix missing from the source]

Introduction to Design of Experiments (DOE)
Objectives of an experiment:
- Which variables have the greatest influence on y?
- How should x1, x2, …, xp be set so that y is close to its optimal value?
- How should x1, x2, …, xp be set so that the variability of y is minimized?
- How should x1, x2, …, xp be set so that the effects of the uncontrollable factors z1, z2, …, zp are minimized?

Common Approaches to Experimentation
- Best-guess approach
  - No good: guess again
  - Good enough: stop
- One-factor-at-a-time
  - Select a baseline starting point
  - Interactions ruin everything
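Why interactions "ruin" the one-factor-at-a-time approach can be seen in a tiny example. The responses below are hypothetical numbers chosen to contain a strong interaction, and the function names are invented for this sketch. Starting from a baseline, the one-factor-at-a-time search changes one factor, keeps the change only if the response improves, and then moves on to the next factor. The full factorial simply evaluates every combination.

```python
from itertools import product

# Hypothetical responses at the four corners of a two-factor, two-level experiment
# (keys are coded levels of (x1, x2): -1 = low, +1 = high)
response = {(-1, -1): 50, (1, -1): 45, (-1, 1): 40, (1, 1): 70}


def one_factor_at_a_time(response, baseline=(-1, -1)):
    """Change one factor at a time from the baseline, keeping only improvements."""
    best = list(baseline)
    for i in range(len(best)):
        trial = best.copy()
        trial[i] = -trial[i]
        if response[tuple(trial)] > response[tuple(best)]:
            best = trial
    return tuple(best)


def full_factorial_best(response):
    """Evaluate every combination of levels and return the best one."""
    return max(product((-1, 1), repeat=2), key=lambda run: response[run])


ofat = one_factor_at_a_time(response)
factorial = full_factorial_best(response)
# Interaction effect in a 2x2 design with one observation per cell
interaction = (response[(1, 1)] + response[(-1, -1)]
               - response[(1, -1)] - response[(-1, 1)]) / 2
print("one-factor-at-a-time settles on", ofat, "with y =", response[ofat])
print("full factorial finds", factorial, "with y =", response[factorial])
print("interaction effect:", interaction)
```

With these numbers, raising either factor alone makes the response worse, so the one-factor-at-a-time search never leaves the baseline, even though raising both factors together gives the best result.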
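Design of Experiments (DOE)
In one test or a series of tests, the values of the process input parameters are changed deliberately in order to observe and identify the factors that affect the process output variables.
Applications:
- Improve process yield
- Reduce process variability and improve product quality
- Reduce development time
- Reduce overall cost
- Evaluate feasible settings
- Evaluate alternative materials
- Determine the factors that affect product characteristics

Example: Optimizing a Process

Basic Principles
- Replication
  - Estimates the natural (experimental) error
  - Central limit theorem
- Randomization
  - "Averaging out" the effects of uncontrollable variables
- Blocking
  - Improves the precision of the experiment

The DOE Procedure
1. Recognize and state the problem
2. Choose the factors and their levels
3. Select the response variable
4. Choose an appropriate experimental design
5. Perform the experiment
6. Analyze the data
7. Draw conclusions and make recommendations
- Follow-up runs and confirmation tests
- Iterative
- No more than 25% of available resources should be invested in the first experiment

Notes
- Use subject-matter knowledge beyond statistics.
- Keep the design and analysis of the experiment as simple as possible.
- Recognize the difference between the statistical results of an experiment and practical reality: cost, technology, time.
- Experiments are usually iterative.
- The first few experiments are usually only learning experiences.

Types of Experimental Design
- Single-factor design
- Variance model
- Single-factor blocking design
- Two-factor design
- Two-level factorial design
- Two-level fractional factorial design
- Three-level factorial design
- Three-level fractional factorial design
- Response surface methodology

Types of Experimental Design (Another Perspective)
- Factor screening (screening experiments)
  - Two-level fractional factorial design
  - Plackett-Burman design
  - Group-screening designs
- Specific region
  - Two-level factorial design
  - Two-level fractional factorial design
  - Three-level factorial design
  - Three-level fractional factorial design
  - Mixed designs
- Optimization (optimizing)
  - Response surface methodology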
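Two of the basic principles, replication and randomization, are easy to show for a two-level factorial design. The sketch below is a minimal illustration. The factor names, the number of replicates, and the function name `factorial_runs` are assumptions made for the example. It enumerates every combination of low (−1) and high (+1) levels, repeats each combination, and shuffles the run order.

```python
import random
from itertools import product


def factorial_runs(factors, replicates=2, seed=0):
    """Replicated two-level full factorial design in randomized run order."""
    # Every combination of low (-1) and high (+1) levels: 2**k treatment combinations
    design = list(product((-1, 1), repeat=len(factors)))
    # Replication: repeat each combination so experimental error can be estimated
    runs = [dict(zip(factors, levels)) for levels in design for _ in range(replicates)]
    # Randomization: shuffle the run order (the seed is fixed only for reproducibility)
    random.Random(seed).shuffle(runs)
    return runs


# Hypothetical factor names
runs = factorial_runs(["temperature", "pressure", "time"])
for order, run in enumerate(runs, start=1):
    print(order, run)
```

Blocking is not shown here. It would amount to splitting the runs into groups (for example by operator or machine) and randomizing the order within each group.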
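Analysis of Variance (ANOVA)
The model: $y_{ij} = \mu + \tau_i + \varepsilon_{ij}$, where
- $y_{ij}$ is the (ij)th observation
- $\mu$ is the overall mean
- $\tau_i$ is the effect of the ith factor level
- $\varepsilon_{ij}$ is the random error, $\varepsilon_{ij} \sim N(0, \sigma^2)$
Fixed Effects Model vs. Random Effects Model

Hypothesis Test
$H_0: \tau_1 = \tau_2 = \cdots = \tau_a = 0$ versus $H_1: \tau_i \neq 0$ for at least one i.
If $H_0$ is rejected, the factor levels have an effect on the response variable; otherwise, they do not.

ANOVA Table
Sources of variation: Treatment, Error, Total. Here a is the number of factor levels and n is the number of observations at each level (the number of replicates). The test statistic is $F_0 = MS_{Treatment}/MS_E$, with a − 1 and a(n − 1) degrees of freedom.

Decision Rule
If $F_0 > F_{\alpha,\,a-1,\,a(n-1)}$, the factor levels have an effect on the response variable; otherwise, they do not. Here $\alpha$ is the significance level (the risk of a false rejection).

Example: A Study of Paper Strength

ANOVA Table
Because $F_0 > F_{0.01,\,3,\,20} = 4.94$, at $\alpha = 0.01$ the factor levels have an effect on the response variable. In other words, there is sufficient evidence that the hardwood concentration affects the strength of the paper.

Box Plot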
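The single-factor ANOVA computation can be sketched as follows. The paper-strength data did not survive in the slides, so the three small groups below are hypothetical, and the function name `one_way_anova` is invented for the example. The sketch assumes a balanced design (a levels, n observations per level) and uses the standard decomposition SS_Total = SS_Treatment + SS_E.

```python
def one_way_anova(groups):
    """Balanced one-way ANOVA: returns sums of squares, degrees of freedom, and F0."""
    a = len(groups)                # number of factor levels
    n = len(groups[0])             # observations per level (replicates)
    grand_mean = sum(sum(g) for g in groups) / (a * n)
    level_means = [sum(g) / n for g in groups]
    # Between-level variation
    ss_treatment = n * sum((m - grand_mean) ** 2 for m in level_means)
    # Within-level variation
    ss_error = sum((y - m) ** 2 for g, m in zip(groups, level_means) for y in g)
    df_treatment, df_error = a - 1, a * (n - 1)
    ms_treatment = ss_treatment / df_treatment
    ms_error = ss_error / df_error
    return {
        "SS_Treatment": ss_treatment, "SS_E": ss_error,
        "df": (df_treatment, df_error),
        "MS_Treatment": ms_treatment, "MS_E": ms_error,
        "F0": ms_treatment / ms_error,
    }


# Hypothetical data: 3 factor levels, 3 observations each
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
result = one_way_anova(groups)
print(result)
```

F0 is then compared with the tabulated value F(α; a − 1, a(n − 1)). The Python standard library has no F quantile function, so the critical value comes from a table or a statistics package.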
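Residual Analysis
Verify that the residuals come from natural variation, $N(0, \sigma^2)$.

Residual Analysis I: Normality Plot

Residual Analysis II: Residuals vs. Factor Levels (Treatment)

Residual Analysis III: Residuals vs. Estimates

Confidence Intervals on the Factor-Level Means
MSE is the best estimate of $\sigma^2$. The 100(1 − α)% C.I. on $\mu_i$ is $\bar{y}_{i\cdot} \pm t_{\alpha/2,\,a(n-1)}\sqrt{MS_E/n}$.
Example: Find the 95% C.I. on the mean for hardwood concentration = 15%.

The Variance Model
When all possible levels of a factor (typically a large number of levels) fall within the scope of the study, the variance model can be used to determine how much the factor contributes to the variation. Steps:
1. Randomly sample a levels from all possible levels of the factor.
2. Use the ANOVA table to obtain $MS_E$ and $MS_{Treatment}$.
3. Estimate the variance components:
- Natural variation other than this factor: $\hat{\sigma}^2 = MS_E$
- Variation caused by this factor: $\hat{\sigma}_\tau^2 = (MS_{Treatment} - MS_E)/n$
- Overall process variation: $\hat{\sigma}_y^2 = \hat{\sigma}^2 + \hat{\sigma}_\tau^2$

Example
Levels are selected at random, and the importance of the factor to the overall process variation is computed.

It follows that if the process variation caused by this factor were eliminated, the overall process variation would fall from 8.86 to 1.90.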
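The variance-model calculation reduces to a few arithmetic steps. The sketch below assumes the usual random-effects estimates, σ̂² = MS_E and σ̂τ² = (MS_Treatment − MS_E) / n. The slides preserve only the final figures 8.86 and 1.90, so the inputs MS_Treatment = 29.73 and n = 4 are assumed values chosen to be consistent with those two numbers, not values taken from the source. The function name is also hypothetical.

```python
def variance_components(ms_treatment, ms_error, n):
    """Variance-component estimates for a single random factor."""
    sigma2 = ms_error                                       # natural variation
    sigma2_tau = max((ms_treatment - ms_error) / n, 0.0)    # factor variation, not below 0
    total = sigma2 + sigma2_tau                             # overall process variation
    share = sigma2_tau / total                              # fraction due to the factor
    return sigma2, sigma2_tau, total, share


# MS_E = 1.90 matches the slide; MS_Treatment and n are ASSUMED for illustration
sigma2, sigma2_tau, total, share = variance_components(29.73, 1.90, 4)
print(f"natural variation    : {sigma2:.2f}")
print(f"factor variation     : {sigma2_tau:.2f}")
print(f"overall variation    : {total:.2f}")
print(f"share due to factor  : {share:.1%}")
```

Clamping the factor component at zero is a common convention for the case MS_Treatment < MS_E, where the raw estimate would be negative. The source does not say how it handles that case.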
Single-Factor Blocking Design
If the experimental data come from several operators or several machines, blocking can be used to separate out the effects of the different operators (or machines).
Example: the effect of chemical type on fabric strength

Structure of the ANOVA Table
The model is $y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}$, where $\beta_j$ is the effect of the jth block.

Example (continued)

Example (continued)
Decision: because $F_0 = 75.13 \gg F_{0.01,\,3,\,12} = 5.95$, the chemical types have an effect on the strength of the fabric.

Latin Square Design
When two blocking factors are needed, such as raw material and operator, a Latin square design can be used.
Example: five formulations A, B, C, D, E

ANOVA Table: Latin Square Design

Common Latin Square Designs
When three blocking factors are needed, a Graeco-Latin square design can be used (omitted).

Two-Factor Design
- Two factors without interaction
- Two factors with interaction

The One-Factor-at-a-Time Method

Model of the Two-Factor Design

ANOVA Table: Two-Factor Factorial
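The two-factor factorial ANOVA can be sketched in the same style as the single-factor case. The slide's model and table were not preserved, so the example assumes the standard effects model y_ijk = μ + τ_i + β_j + (τβ)_ij + ε_ijk with a levels of factor A, b levels of factor B, and n replicates per cell. The data and the function name `two_factor_anova` are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def two_factor_anova(data):
    """data[i][j] holds the n replicate observations at level i of A and level j of B."""
    a, b, n = len(data), len(data[0]), len(data[0][0])
    grand = mean([y for row in data for cell in row for y in cell])
    a_means = [mean([y for cell in row for y in cell]) for row in data]
    b_means = [mean([y for row in data for y in row[j]]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in data]

    ss_a = b * n * sum((m - grand) ** 2 for m in a_means)
    ss_b = a * n * sum((m - grand) ** 2 for m in b_means)
    ss_ab = n * sum((cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
                    for i in range(a) for j in range(b))
    ss_e = sum((y - cell_means[i][j]) ** 2
               for i in range(a) for j in range(b) for y in data[i][j])

    ms_e = ss_e / (a * b * (n - 1))
    f0 = {
        "A": (ss_a / (a - 1)) / ms_e,
        "B": (ss_b / (b - 1)) / ms_e,
        "AB": (ss_ab / ((a - 1) * (b - 1))) / ms_e,
    }
    return {"SS_A": ss_a, "SS_B": ss_b, "SS_AB": ss_ab, "SS_E": ss_e, "F0": f0}


# Hypothetical 2 x 2 experiment with n = 2 replicates per cell
data = [[[50, 52], [40, 42]],
        [[45, 47], [70, 72]]]
result = two_factor_anova(data)
print(result)
```

In this made-up data set the interaction sum of squares dominates both main effects, which is the situation where looking at one factor at a time gives a misleading picture.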