
## Compute the self excluded sample mean by group

Stata's `egen` command computes a summary statistic by group and stores it in a new variable. For example, suppose the data has three variables, id, time and y, and we want to compute the mean of y for each id and store it as a new variable mean_y.

In Stata, the command would be

egen mean_y = mean(y), by(id)

In R, this task can be completed with `ave`.

Generate dataset:

```
id <- rep(1:3, each = 3)
t <- rep(1:3, 3)
y <- sample(1:5, 9, replace = TRUE)
my_data <- data.frame(id = id, time = t, y = y)
```

Original data:

```
> my_data
  id time y
1  1    1 4
2  1    2 1
3  1    3 4
4  2    1 2
5  2    2 3
6  2    3 3
7  3    1 4
8  3    2 4
9  3    3 3
```

```
> within(my_data, {mean_y = ave(y,id)} )
  id time y   mean_y
1  1    1 4 3.000000
2  1    2 1 3.000000
3  1    3 4 3.000000
4  2    1 2 2.666667
5  2    2 3 2.666667
6  2    3 3 2.666667
7  3    1 4 3.666667
8  3    2 4 3.666667
9  3    3 3 3.666667
```
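As a cross-check, here is a minimal sketch that compares `ave` with a per-group `tapply`. The y values are typed in from the table above, because `sample` draws different values on every run; the names `group_means` and `expanded` are only illustrative.

```r
# Data copied from the table above so the result is reproducible
my_data <- data.frame(
  id   = rep(1:3, each = 3),
  time = rep(1:3, times = 3),
  y    = c(4, 1, 4, 2, 3, 3, 4, 4, 3)
)

# ave() returns a vector as long as y, with each group's mean repeated
mean_y <- ave(my_data$y, my_data$id)

# tapply() returns one mean per group; index it by id to expand it back
group_means <- tapply(my_data$y, my_data$id, mean)
expanded <- as.numeric(group_means[as.character(my_data$id)])

# Both routes should give the same per-row group means
all.equal(mean_y, expanded)
```

The point of `ave` is that it skips the expand-back step: the result already lines up row by row with the data, which is what `egen ... , by()` does in Stata.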

The default summary statistic is the mean. However, we can supply any other function through the `FUN` argument, which must be passed by name. For example, to compute the standard deviation of y by id:

```
> within(my_data, {sd_y = ave(y,id,FUN=sd)} )
  id time y      sd_y
1  1    1 4 1.7320508
2  1    2 1 1.7320508
3  1    3 4 1.7320508
4  2    1 2 0.5773503
5  2    2 3 0.5773503
6  2    3 3 0.5773503
7  3    1 4 0.5773503
8  3    2 4 0.5773503
9  3    3 3 0.5773503
```

Remark: `within` evaluates an expression in an environment created from the data.frame and returns a modified copy of the data.frame (in our case, with the new variable mean_y or sd_y). The original data.frame is not changed unless the result is assigned back to it.
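To make the remark concrete, here is a small sketch, again with the y values typed in from the table above. The name `tmp` is only illustrative.

```r
my_data <- data.frame(
  id   = rep(1:3, each = 3),
  time = rep(1:3, times = 3),
  y    = c(4, 1, 4, 2, 3, 3, 4, 4, 3)
)

# within() hands back a new data.frame; my_data itself is untouched here
tmp <- within(my_data, { mean_y <- ave(y, id) })
"mean_y" %in% names(my_data)  # FALSE
"mean_y" %in% names(tmp)      # TRUE

# To keep the new column, assign the result back ...
my_data <- within(my_data, { mean_y <- ave(y, id) })

# ... or add the column directly without within()
my_data$sd_y <- ave(my_data$y, my_data$id, FUN = sd)
```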

Here is another use of `ave`: creating a self-excluded sample mean by group.

Suppose the data has three variables, id, time and y, and we want to compute the mean of y for each id, excluding the value of y in the current time period.

```
id <- rep(1:3, each = 3)
t <- rep(1:3, 3)
y <- sample(1:5, 9, replace = TRUE)
my_data <- data.frame(id = id, time = t, y = y)
```

Original data:

```
> my_data
  id time y
1  1    1 4
2  1    2 1
3  1    3 4
4  2    1 2
5  2    2 3
6  2    3 3
7  3    1 4
8  3    2 4
9  3    3 3
```

First, we need a function to compute the self-excluded mean. This function takes a vector and a function (default `mean`) as arguments. It applies the function to the vector with one element removed at a time. The return value is a vector whose i-th element is FUN(x[-i]).

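```
excludeSelfSummary <- function(x, FUN = mean) {
  # For each position i, apply FUN to x with its i-th element removed.
  # seq_along(x) also behaves correctly when x has length zero.
  sapply(seq_along(x), function(i) FUN(x[-i]))
}
```

```
> excludeSelfSummary(1:5,mean)
[1] 3.50 3.25 3.00 2.75 2.50
> excludeSelfSummary(1:5,min)
[1] 2 1 1 1 1
> excludeSelfSummary(1:5,max)
[1] 5 5 5 5 4
```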
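For the mean specifically, the loop can be avoided, because the mean of the other elements is (sum of all values minus the own value) divided by (n - 1). The sketch below uses a hypothetical helper name, `excludeSelfMean`, and assumes there are no missing values in x. It does not generalise to other statistics such as min or max.

```r
# Self-excluded mean in closed form: (sum(x) - x[i]) / (n - 1).
# Assumes no NA values; a vector of length 1 gives 0/0 = NaN.
excludeSelfMean <- function(x) {
  (sum(x) - x) / (length(x) - 1)
}

excludeSelfMean(1:5)
# [1] 3.50 3.25 3.00 2.75 2.50

# It can be passed to ave() in the same way as any other FUN
y  <- c(4, 1, 4, 2, 3, 3, 4, 4, 3)
id <- rep(1:3, each = 3)
ave(y, id, FUN = excludeSelfMean)
# [1] 2.5 4.0 2.5 3.0 2.5 2.5 3.5 3.5 4.0
```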

Then we pass `excludeSelfSummary` to `ave` as the `FUN` argument.

```
> within(my_data, {excl_mean = ave(y,id,FUN=excludeSelfSummary)} )
  id time y excl_mean
1  1    1 4       2.5
2  1    2 1       4.0
3  1    3 4       2.5
4  2    1 2       3.0
5  2    2 3       2.5
6  2    3 3       2.5
7  3    1 4       3.5
8  3    2 4       3.5
9  3    3 3       4.0
```

Of course, we could also compute the self-excluded minimum or maximum.

```
> within(my_data, {excl_min = ave(y,id,FUN=function(x) excludeSelfSummary(x,min) )})
  id time y excl_min
1  1    1 4        1
2  1    2 1        4
3  1    3 4        1
4  2    1 2        3
5  2    2 3        2
6  2    3 3        2
7  3    1 4        3
8  3    2 4        3
9  3    3 3        4
```
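One edge case worth checking is a group with a single observation, where nothing is left once the row itself is dropped. The sketch below uses a small hypothetical data set (`small`) and a hypothetical guarded variant (`excludeSelfSafe`); returning NA for such groups is an assumption about what is wanted, not the only reasonable choice.

```r
excludeSelfSummary <- function(x, FUN = mean) {
  sapply(seq_along(x), function(i) FUN(x[-i]))
}

# Hypothetical unbalanced panel: id 2 has only one observation
small <- data.frame(id = c(1, 1, 2), y = c(2, 4, 7))

# For id 2 the remaining vector is empty, so mean() returns NaN
res <- ave(small$y, small$id, FUN = excludeSelfSummary)
res
# [1]   4   2 NaN

# Guarded variant: return NA when a group has fewer than two values
excludeSelfSafe <- function(x, FUN = mean) {
  if (length(x) < 2) return(rep(NA_real_, length(x)))
  sapply(seq_along(x), function(i) FUN(x[-i]))
}

res_safe <- ave(small$y, small$id, FUN = excludeSelfSafe)
res_safe
# [1]  4  2 NA
```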