Search Results

165 total results found

GLM for Multinomial Outcomes

Generalized Linear Models

Multinomial outcomes are much akin to binomial outcomes, with added complexity due to outcomes with more than 2 levels. In such cases it can be difficult to determine an 'order' to the outcomes. Log-linear models can be used for analysis of this type of data....

Multi-Level Modeling

Analysis of Correlated Data

Recall the core of mixed models is that they incorporate fixed and random effects. While single-level models assume one variance, subjects within the same level are correlated in terms of $\sigma_{0j}^2 + \sigma_{1j}^2 X_{ij} + \varepsilon_{ij}$, where y is an N×1 column vector of the outcome; X i...

Non-Inferiority in Clinical Trials

Applied Statistics in Clinical Trials

Usually clinical trials should show if a new treatment is superior to placebo or no treatment, but as we've previously discussed it is not always ethical to give out a placebo when an effective treatment has been identified. The goal of non-inferiority trials...

Effect Modification and Interaction

Applied Statistics in Clinical Trials

Interaction is when a treatment effect is different across different subgroups of the population defined by a baseline covariate. Commonly examined effect modifiers include: demographic variables, study location, or baseline prognostic factors. If an interact...

Multiple Imputation

Analysis of Correlated Data

If no missing data are present, our statistical methods provide valid inference only if the following assumptions are met: for Generalized Estimating Equations, the mean function is correctly specified; for likelihood-based methods, the probability density fu...

Multivariate and Joint Models for Longitudinal Data

Analysis of Correlated Data

Longitudinal studies are commonly designed in many research fields in order to see changes over a time interval shared by all participants. Joint modeling consists of two interlinked sub-models with any type of outcome (continuous, binomial, etc). One of the m...

Interim Analysis and Data Monitoring

Applied Statistics in Clinical Trials

Clinical trials are often longitudinal in nature. It is often impossible to enroll all subjects at the same time, so it can take a long time to complete a longitudinal study. Over the course of the trial one needs to consider administrative monitoring, safety ...

GLM for Count Data

Generalized Linear Models

Generalized linear models for count data are regression techniques available for modeling outcomes describing a type of discrete data where the occurrence might be relatively rare. A common distribution for such a random variable is Poisson. The probability t...

Time Series Models

Analysis of Correlated Data

In standard regression we must assume observations are independent of one another, but with time series data we expect neighboring observations to be correlated. Time series analysis helps organizations understand the underlying causes of trends or sys...

Correlated Data in Clinical Trials

Applied Statistics in Clinical Trials

Note: My BS857 Notebook on Correlated Data goes into much more depth than the notes below. So far we have focused on independent outcomes in clinical trials, but we often work with correlated or non-independent outcomes, such as crossover ...

Gamma Regression

Generalized Linear Models

Consider a continuous dependent variable that is positive-valued, such as the length of a hospital stay, a waiting time, or the cost of a bill. This type of data is continuous in nature and often skewed, so a normal approximation does not hold. The type of da...

Survival - Time to Failure

Generalized Linear Models

Analysis of survival data is more complex than other methods we've seen so far; we can't just take the mean survival time and a confidence interval to predict when the last patient will die. Also, survival times are unlikely to follow a Normal distribution,...

GLM for Correlated Data

Generalized Linear Models

So far the models we've covered assume independence between observations collected on separate individuals. When observations are correlated, models that incorporate the existing correlation in the data should be employed. Many approaches are proposed ...

The Basics of Design

D3.js

Data visualizations should be easy to interpret and look credible. To do this there are several factors that should be kept in focus, called Edward Tufte's Six Design Principles of Graphical Integrity[1]: The representation of numbers on the graphic should be prop...

Dynamic and Interactive Content

D3.js

Thus far we've looked at building static content, but the backbone of D3.js is its beautiful transitions and dynamic updating capabilities. Intervals: We need some way of repeatedly running code to change something the chart reacts to. The easiest way to d...

Layouts and Structured Data

D3.js

Now that I've covered the basics of programming in D3, let's take a look at some of the other cool things one can build with D3. Before jumping into the code, it's worth mentioning the resources available within the D3 community for sharing reusable code. As o...

File Structure and Linked Views

D3.js

After adding a lot of different event listeners, the JavaScript file can become messy. This section focuses on writing readable code in an 'Object Oriented' way for larger projects (but OOP will not be covered in depth here). Once a class is set up for a visua...

Intro to Spark / RDDs

Scala + Spark

Apache Spark: Spark is a fast and general engine for large-scale data processing. The user writes a Driver Program containing the script that tells Spark what to do with your data, and Spark builds a Directed Acyclic Graph (DAG) to optimize workflow. With a ma...

DataFrames and Advanced Techniques

Scala + Spark

A Spark Dataset is an extension of the RDD object. It has rows, can run queries, and has a schema (which leads to more efficient storage and optimization). A DataFrame is just a Dataset of Row objects, and unlike a Dataset the schema is inferred at runtime rat...

Scala

Scala + Spark

I'll start with a disclaimer: these are notes written by an experienced Java dev, thus some level of basic programming knowledge is required; I cannot possibly cover every unique feature of Scala, only what's most important to me; Awesome and free tutorials...