Blog Archives

Least Squares Linear Regression

Least-squares regression is a method for finding the equation of the best-fit line through a set of data points. It also provides a measure of how well the data follow a linear relationship. An example of data with a general linear trend is seen in the above graph. First, we will go over the derivation of the formulas from theory; the Scilab code implementing the algorithm is appended at the end of this post.

The equation of a line through the data points can be written as:

$$y = mx + b$$

The value of any data point that is not directly on the line but is in the proximity of the line can be given by:

$$y_i = m x_i + b + e_i$$

Where $e_i$ is the vertical error between the y-value given by the line and the actual y-value of the data. The goal is to come up with a line which minimizes this error. In least-squares regression, this is accomplished by minimizing the sum of the squares of the errors. The sum of the squares of the errors is given by:

$$S = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - m x_i - b \right)^2$$

In order to minimize this value, the minimum-finding techniques of differential calculus will be used. First take the derivative with respect to the slope and set it to zero.

$$\frac{\partial S}{\partial m} = -2 \sum_{i=1}^{n} x_i \left( y_i - m x_i - b \right) = 0$$

Then doing the same with respect to the y-intercept yields:

$$\frac{\partial S}{\partial b} = -2 \sum_{i=1}^{n} \left( y_i - m x_i - b \right) = 0 \quad \Longrightarrow \quad b = \frac{\sum y_i - m \sum x_i}{n}$$

Which can be substituted in the previous equation to solve for the slope.

$$m = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2}$$

The y-intercept is then:

$$b = \bar{y} - m \bar{x}$$

These last two formulas require only the data point coordinates and the number of points, so they are all that is needed to find the equation of the least-squares regression line.

Finally, below is the Scilab code implementation.

//the linear regression function takes x-values and
//y-values of data in the column vectors 'X' and 'Y' and finds
//the best fit line through the data points. It returns
//the slope and y-intercept of the line as well as the
//coefficient of determination ('r_sq').

//the function call for this should be of the form:
//'[m,b,r2]=Linear_Regression(x,y)'
function [slope, y_int, r_sq]=Linear_Regression(X, Y)
    //determine the number of data points
    n=size(X,'r');

    //initialize each summation
    sum_x=0;
    sum_y=0;
    sum_xy=0;
    sum_x_sq=0;
    sum_y_sq=0;

    //calculate each sum required to find the slope, y-intercept and r_sq
    for i=1:n
        sum_x=sum_x+X(i);
        sum_y=sum_y+Y(i);
        sum_xy=sum_xy+X(i)*Y(i);
        sum_x_sq=sum_x_sq+X(i)*X(i);
        sum_y_sq=sum_y_sq+Y(i)*Y(i);
    end

    //determine the average x and y values for the
    //y-intercept calculation
    x_bar=sum_x/n;
    y_bar=sum_y/n;

    //calculate the slope, y-intercept and r_sq and return the results
    slope=(n*sum_xy-sum_x*sum_y)/(n*sum_x_sq-sum_x^2);
    y_int=y_bar-slope*x_bar;
    r_sq=((n*sum_xy-sum_x*sum_y)/(sqrt(n*sum_x_sq-sum_x^2)*sqrt(n*sum_y_sq-sum_y^2)))^2;

    //determine the appropriate axes size for plotting the data and
    //linear regression line
    axes_size=[min(X)-0.1*(max(X)-min(X)),min(Y)-0.1*(max(Y)-min(Y)),max(X)+0.1*(max(X)-min(X)),max(Y)+0.1*(max(Y)-min(Y))];

    //plot the provided data
    plot2d(X,Y,style=-4,rect=axes_size);

    //plot the calculated regression line
    plot2d(X,(slope*X+y_int));
endfunction
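
As a cross-check on the loop-based function above, the same sums can be written with Scilab's vectorized `sum`. This is only a sketch: the name `linreg_sketch` is made up for illustration, it skips the plotting, and it assumes `X` and `Y` are vectors of equal length containing at least two distinct x-values.

```scilab
// Hypothetical vectorized variant of the slope, intercept and r_sq formulas.
function [slope, y_int, r_sq] = linreg_sketch(X, Y)
    n = length(X);                 // number of data points
    sx = sum(X);
    sy = sum(Y);
    sxy = sum(X .* Y);             // sum of x*y products
    sxx = sum(X .^ 2);             // sum of squared x-values
    syy = sum(Y .^ 2);             // sum of squared y-values
    slope = (n*sxy - sx*sy) / (n*sxx - sx^2);
    y_int = (sy - slope*sx) / n;   // same as y_bar - slope*x_bar
    r_sq = (n*sxy - sx*sy)^2 / ((n*sxx - sx^2) * (n*syy - sy^2));
endfunction

// Points lying exactly on y = 2x + 1
x = [1; 2; 3; 4];
y = [3; 5; 7; 9];
[m, b, r2] = linreg_sketch(x, y);
mprintf("slope = %g, intercept = %g, r_sq = %g\n", m, b, r2);
```

Because the sample points lie exactly on a line, the slope and intercept should come back as 2 and 1 with a coefficient of determination of 1.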

I hope this proves helpful. Let me know in the comments if you have any questions.


Structural Finite Element Analysis User’s Guide

Now that you have downloaded and installed Scilab and the Structural Finite Element Analysis (SFEA) program and have it up and running, how do you use it? That is what we want to go through at this point. What you should see on your desktop are two blank screens. The one on the left will provide the user input controls while the one on the right will graphically depict the structure being analyzed. If this is not the case then you should read here first.

Getting Started: The File Menu

In the left screen there is a ‘File’ menu. Clicking it offers ‘New’ or ‘Close’. ‘Close’ is self-explanatory. ‘New’ opens a sub-menu of three choices: Beam, Frame, or Truss. To make the difference clear: in this program a beam element supports transverse and moment forces, a frame element supports axial, transverse, and moment forces, and a truss element supports axial and transverse forces.

To give an example of why this might be important, consider a simple cantilever beam with a non-orthogonal loading (in other words, the load does not meet the beam’s longitudinal axis at 90 degrees). Although it is a ‘beam’ in its form and positioning, the load will produce an axial force in the structure, so it is not a ‘beam’ with regard to the methodology of this program. To analyze this structure, you would select ‘Frame’.

Once you make this selection, the left window will be populated with the controls. You will notice that there are two frames. The top frame walks the user step by step through entering the necessary data. The bottom frame contains the controls for obtaining the results: node displacements, support reactions, and elemental or member forces.

Entering Node Data

Now that the controls are up, the user simply starts at the top left and goes across and down, just like reading text. Start by entering the number of nodes in the text box at the top and then click the ‘Node Coordinates’ button. This brings up a dialog box that asks whether to enter the nodes ‘Singularly’, ‘By Sets’, or ‘From File’. ‘Singularly’ means manually entering the node coordinates one at a time. ‘By Sets’ means the user enters a repeating pattern that the nodes follow and the program generates the nodes from it. This is helpful when entering a large number of nodes whose geometric arrangement follows a consistent pattern. ‘From File’ means the user imports the coordinates from a user-created text file that contains all the node coordinates.

Singularly

If the user selects this method, a new dialog box will open to let the user input the node coordinates.

The number of rows in the dialog box matches the number of nodes the user entered for the structure.

By Sets

If the user selects this method, another dialog will appear to ask the user how many sets they have to enter. This is a simple integer input.

After entering the number of sets, another dialog box will be displayed to obtain the set data from the user.

The rows in the dialog box correspond to the number of sets. For each set the user must enter the node number to start with, the increment in node number for each subsequent node, the x-y coordinate of the first node, the x and y distance between each node and the previous one, and the number of nodes in the set.

For example, if I were analyzing a simple cantilever beam and wanted to divide the beam into 20 separate elements, then I would have 21 nodes. Since I would space the nodes out evenly along the length of the beam, I could enter those nodes by sets without having to enter them one at a time. If the length of the entire beam is 120 inches (10 feet), then I would have 20 elements each 6 inches long. So in entering the data by sets in the dialog above I would input: node number start = 1, node number increment = 1, x start = 0, y start = 0, x increment = 6, y increment = 0, number of nodes in set = 21. The program would generate 21 nodes starting at the x-y coordinate (0,0) and at every 6 inches horizontally for the full length of the beam.
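
The set parameters in the cantilever example can be expanded into coordinates with a few lines of Scilab. This is a sketch of the idea only: `nodes_from_set` is a hypothetical helper, not a function from SFEA, and the output layout (node number, x, y) is an assumption.

```scilab
// Hypothetical helper: expand one 'By Sets' row into node numbers and coordinates.
function nodes = nodes_from_set(n_start, n_inc, x0, y0, dx, dy, count)
    k = (0:count-1)';                        // 0, 1, ..., count-1 as a column
    nodes = [n_start + n_inc*k, x0 + dx*k, y0 + dy*k];
endfunction

// Cantilever example: 21 nodes, 6 inches apart, along the x-axis
nodes = nodes_from_set(1, 1, 0, 0, 6, 0, 21);
disp(nodes(1, :));   // first row: node 1 at x = 0, y = 0
disp(nodes($, :));   // last row: node 21 at x = 120, y = 0
```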

From File

If the user selects this method, a standard Windows dialog box will open where you can search in the file tree structure to find the text file containing the node coordinates data. The text file should be formatted with one node coordinate per line starting with the x-coordinate followed by a space followed by the y-coordinate.
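
To illustrate the file layout described above, the following sketch writes a three-node file and reads it back with Scilab's `fprintfMat` and `fscanfMat`. How SFEA itself parses the file is not shown in this post, so this only demonstrates the "x, space, y" layout; the file name is made up.

```scilab
// Three nodes spaced 6 inches apart along the x-axis: one [x, y] pair per row
coords = [0, 0; 6, 0; 12, 0];

// Hypothetical file name inside Scilab's temporary directory
node_file = fullfile(TMPDIR, "nodes_example.txt");

fprintfMat(node_file, coords);    // one line per row, values separated by a space
loaded = fscanfMat(node_file);    // read the text file back into a matrix
disp(loaded);
```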

Whichever method the user selects, once the data has been provided, the program outputs the node coordinate matrix to the Scilab console, where the user can verify the coordinates numerically. Additionally, the nodes will be plotted graphically in the second SFEA window. This allows the user to verify the nodes visually.

Entering Support Reactions Data

Once the nodes have been entered, the user can set the boundary conditions for the structure by pushing the ‘Select Reactions’ button. A dialog box will then open and allow the user to select the support type at each node.

Notice the default node type is free, so the user only has to define those nodes which are supports. An ‘X-Roller’ allows rotation and movement in the y-direction but no movement in the x-direction. Likewise a ‘Y-Roller’ allows rotation and movement in the x-direction but no movement in the y-direction. A pin allows rotations but no x or y movement. Finally a fixed support allows no movement at all.
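
The support definitions above can be summarized as a table of restrained degrees of freedom. The sketch below is an illustration only: the numeric coding of the support types and the variable names are assumptions, not SFEA's internal representation.

```scilab
// Columns: [x-translation, y-translation, rotation]; 1 = restrained, 0 = free to move
support_names = ["free"; "x-roller"; "y-roller"; "pin"; "fixed"];
restrained = [0, 0, 0; 1, 0, 0; 0, 1, 0; 1, 1, 0; 1, 1, 1];

for i = 1:size(restrained, "r")
    mprintf("%-9s ux=%d uy=%d rot=%d\n", support_names(i), ..
            restrained(i, 1), restrained(i, 2), restrained(i, 3));
end

// Example: a 3-node cantilever with node 1 fixed and nodes 2 and 3 free.
// node_types holds a (hypothetical) row index into the table for each node.
node_types = [5; 1; 1];
R = restrained(node_types, :)';   // one column per node
flags = R(:);                     // stacked as ux1, uy1, rot1, ux2, ...
fixed_dofs = find(flags == 1);    // positions of the restrained degrees of freedom
disp(fixed_dofs);
```

For the cantilever, only the three degrees of freedom of node 1 are restrained, so `fixed_dofs` holds the positions 1, 2, and 3.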

Once the user defines the node types, the node type matrix will be displayed in the Scilab console and the structure plot will be modified with different symbology for different node types as seen below. The symbols are from left to right, free, x-roller, y-roller, pin, and fixed.

Entering Member or Element Data

Once the nodes and reactions are defined, the user can input the members that run between the nodes. In much the same way as with the nodes, the user enters the number of members in the text box in the first SFEA window and then pushes the ‘Member Data’ button. As with the nodes, a dialog box will permit the user to input the data ‘Singularly’, ‘By Sets’, or ‘From File’.

Singularly

If the user selects this method, a dialog box will appear to allow the user to input the data for each member. The data required are the start node number, the end node number, the member’s cross-sectional area (A), the member’s modulus of elasticity (E), and the member’s moment of inertia (I).

By Sets

In the same way as with node entry, the ‘By Sets’ method of member entry allows the user to systematically enter a large number of members with repetitive characteristics. With regard to member entry, each member of a set must have the same A, E, and I.

From File

As before, this allows the user to load the member data from a text file on the computer. The file should have one member per line, with the values on each line separated by single spaces in this order: start node number, end node number, A, E, I.

Once the member data is entered, this data is output to the Scilab console as the connectivity matrix. Also the members will be plotted in the SFEA plot window.

Entering Applied Loads

There are two aspects to the entry of the loads that affect the structure. To understand this fully, let’s define some of the terminology. The first piece of information to enter in the first SFEA window is the ‘Number of Load Types’. A load type is typically something like dead loads, live loads, wind loads, seismic loads, etc. These are each different load types. A structure subject to all four of the load types listed would have 4 load types. Another term in the program is ‘Number of Load Combinations’. A load combination is a linear combination of the form:

$$A_1 x_1 + A_2 x_2 + \dots + A_n x_n$$

Where each ‘x’ is a load type and each ‘A’ is a load factor. A load factor is typically used in Load and Resistance Factor Design (LRFD) as a scaling or safety factor. So for example we might have:

Where D stands for dead load and L stands for live load. In many structural applications and building codes there are sets of multiple load combinations and corresponding load factors. Each of these combinations must be calculated and the worst case scenario is then selected for design. The SFEA program provides the capacity for this type of analysis.

Load Types and Loads

Once the user enters the number of load types and pushes the ‘Loads’ button, a dialog box will appear that allows the entry of the loads for each type.

Each row in the dialog box corresponds to each node in the structure. Consequently you can enter a horizontal, vertical, and moment load at each node. After pressing ‘OK’ another identical dialog box will appear for the next load type.

The loads matrix will be displayed in the Scilab console after entry. Each column of the matrix is a different load type. The rows are arranged as follows. Each node has 3 potential loads, so the first 3 rows correspond to the 3 loads on node 1, the next 3 rows correspond to node 2, and so on. There should therefore be 3N rows, where N is the number of nodes. Within each set of 3 rows, the loads appear in the same order as in the dialog box: horizontal, vertical, moment.
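
The row layout described above amounts to a simple index formula. A minimal sketch, with `dof_row` as a made-up helper name:

```scilab
// Hypothetical helper: row of the loads matrix for a given node and component.
// component: 1 = horizontal, 2 = vertical, 3 = moment
function row = dof_row(node, component)
    row = 3*(node - 1) + component;
endfunction

N = 4;                      // example structure with 4 nodes
disp(dof_row(1, 1));        // 1: horizontal load on node 1
disp(dof_row(2, 2));        // 5: vertical load on node 2
disp(dof_row(N, 3));        // 12: moment on the last node, which equals 3N
```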

Load Combinations and Factors

As discussed above, the number of load combinations and load factors are often listed in manuals or code books. Once the number of combinations is entered and the user pushes the ‘Load Factors’ button, the following dialog box appears.

Here the user enters the load factors that correspond to each load type for each combination.

Once this data is entered, the load factors will be displayed in the Scilab console along with the ultimate loads. The ultimate loads are the maximum loads at each node calculated from all the given loads, load combinations, and load factors. The ultimate loads will also be depicted on the structure in the plot window with red lines or curves. A line runs from the node in the direction of the force. A semicircular curve around the left side of a node is a positive moment according to the right-hand rule, while a curve around the right side of a node is negative.
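
The way loads, combinations, and factors fit together can be sketched with a small matrix product. All numbers below are made-up example data, and the factors 1.4 and 1.2/1.6 are only illustrative. The post does not say whether SFEA compares signed values or magnitudes when picking the maximum, so this sketch assumes the largest magnitude governs.

```scilab
// Example data (assumed): 2 nodes, so 3*2 = 6 rows; columns are dead and live load types.
// Row order per node: horizontal, vertical, moment.
loads = [0, 0; -10, -4; 0, 0; 0, 2; -5, 0; 0, 0];

// One row per load combination, one column per load type (illustrative factors)
factors = [1.4, 0; 1.2, 1.6];

// Each column of 'combined' is the factored load vector for one combination
combined = loads * factors';

// Assumption: for each row, keep the combination with the largest magnitude
[peak, which] = max(abs(combined), "c");
ultimate = zeros(size(combined, "r"), 1);
for i = 1:size(combined, "r")
    ultimate(i) = combined(i, which(i));
end
disp(ultimate);
```

With this data the second combination governs the vertical load on node 1 (about -18.4) and the horizontal load on node 2 (about 3.2), while the first combination governs the vertical load on node 2 (about -7).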

Find Displacements

At this point the structure is fully defined and graphically displayed in the plot window. When the user pushes the ‘Find Displacements’ button, the displacements vector will be displayed in the Scilab console and the displaced shape of the structure will be plotted in blue. The displacements vector follows the same convention as the loads matrix discussed above. The first 3 rows correspond to the horizontal, vertical, and angular displacements of node 1, the next 3 rows to node 2, and so on.

Find Reactions

When the user presses this button next, the reactions vector is displayed in the Scilab console. As with the loads and displacements, it lists the horizontal, vertical, and moment reactions for each node, starting with node 1 and going down.

Find Member Forces

Finally, when the user pushes this button, the local member forces matrix is displayed in the Scilab console. Each column corresponds to a member: column 1 for member 1 and so on. The first 3 rows are the horizontal, vertical, and moment forces on the member at the starting node, while rows 4, 5, and 6 are the forces on the member at the ending node. It is important to understand that these forces are expressed in the member’s local axes. The local axes for each member have their origin at the starting node, with the horizontal or x-axis directed toward the ending node. The vertical or y-axis is then orthogonal to the local x-axis.
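
To make the local-axis convention concrete, the sketch below builds the standard 2D rotation that maps a global (horizontal, vertical, moment) triple into a member's local axes. It is a generic illustration, not SFEA's own code, and it assumes the local y-axis is the local x-axis rotated 90 degrees counterclockwise; `local_transform` is a made-up name.

```scilab
// Hypothetical helper: rotation from global axes to the local axes of a member
// that starts at (x1, y1) and ends at (x2, y2).
function T = local_transform(x1, y1, x2, y2)
    len = sqrt((x2 - x1)^2 + (y2 - y1)^2);   // member length
    c = (x2 - x1) / len;                     // cosine of the member angle
    s = (y2 - y1) / len;                     // sine of the member angle
    T = [c, s, 0; -s, c, 0; 0, 0, 1];        // the moment component is unchanged
endfunction

// Member from (0,0) to (3,4) carrying a purely horizontal global force of 10
T = local_transform(0, 0, 3, 4);
f_global = [10; 0; 0];
f_local = T * f_global;
disp(f_local);   // about 6 along the member and -8 across it
```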

I hope this user’s guide has been helpful and provided sufficient information or answers to your questions. However if you are still stuck or want further information just let me know in the comments.

The Bisection Method using Scilab

 

Recently, I have really been enjoying learning to use Scilab, which is open-source mathematics software that works a lot like Matlab. Specifically, we have run into some design problems in water and wastewater engineering that involve solving for variables that are defined implicitly by an equation and cannot be isolated through typical algebraic means. For example:

This cannot be solved directly for x. So how can we find the value of x that will satisfy the equation? Well this is where the numerical analysis technique of the bisection method comes in. The bisection method is a bounded or bracketed root-finding method. In other words, it will locate the root of an equation provided you give it the interval in which a root is located.

The algorithm searches for the root by dividing the interval in half and determining which half contains the root. Then it divides that half-interval in half again, and so on, until the estimate is within a specified relative error. Below I have entered my code for this algorithm. If you save this file as Bisection_Method.sci in a scilab/modules/…/macros folder and execute it in the SciNotes editor, you can then use it by entering a command in the console like:

root = Bisection_Method('x-sin(x)-3',0,6.5,0.5)

The first parameter is the function expression itself, arranged so that f(x) = 0. The second parameter is the lower end of the interval while the third is the upper end. The final parameter is the percent relative error that the algorithm must reach before it terminates and gives a result.
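
For reference, the midpoint estimate and the percent relative error used as the stopping test can be written as follows. The symbols $x_l$, $x_u$, $x_r$, $\varepsilon_a$, and $\varepsilon_s$ are chosen here to mirror the variable names `x_lower`, `x_upper`, `xr_new`, `ea`, and `es` in the code below.

```latex
x_r = \frac{x_l + x_u}{2},
\qquad
\varepsilon_a = \left| \frac{x_r^{\text{new}} - x_r^{\text{old}}}{x_r^{\text{new}}} \right| \times 100\%,
\qquad
\text{stop when } \varepsilon_a \le \varepsilon_s
```

Note that this error measure is undefined when the new estimate is exactly zero, so it is best suited to roots away from $x = 0$.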

Please feel free to download Scilab for free here and copy this code and use it. Let me know if you have any troubles or if you find a problem with the code.

 

//The Bisection Method is a bracketed root locating method. Therefore
//the user must input the lower and upper bounds on the interval where
//the root is to be located. 
//func_expression - must be a string containing the function expression
//with the variable written as 'x'.
//es - is the relative percent error criterion that must be met for the
//method to terminate.
function [root]=Bisection_Method(func_expression, x_lower, x_upper, es)
    x = x_upper;
    fu = evstr(func_expression); //evaluate the function at x_upper
    x = x_lower;
    fl = evstr(func_expression); //evaluate the function at x_lower
    //***********************************************************
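    //test to see if there is a root in the interval; a bound that is
    //itself a root is returned directly
    if (fu == 0) then
        root = x_upper;
    elseif (fl == 0) then
        root = x_lower;
    elseif (fu*fl > 0) then
        //no sign change, so the method cannot bracket a root here
        root = 'no root in interval given';
    else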
        //there is a root in the interval
        exact_solution = 'notfound';
        ea = 100;
        xr_new =(x_upper + x_lower)/2;
        //*******************************************************
        //iteratively progress toward the root until it is found or it
        //is approximated with a relative error less than required
        while ea > es & exact_solution == 'notfound',
            x = xr_new;
            fr = evstr(func_expression);
            x = x_lower;
            fl = evstr(func_expression);
            //***************************************************
            if(fl*fr < 0) then
                x_upper = xr_new;
            elseif(fl*fr > 0) then
                x_lower = xr_new;
            elseif(fl*fr == 0) then
                root = xr_new;
                exact_solution = 'found';
            end //of if statement
            //***************************************************
            //calculate the approximate relative error for the iteration
            xr_old = xr_new;
            xr_new =(x_upper + x_lower)/2;
            ea = abs((xr_new - xr_old)/xr_new)*100;
        end //of while loop 
        //*******************************************************
        //if error criterion has been met but the exact answer has not
        //been found, set the root to the adequate approximation.
        if (exact_solution == 'notfound') then
            root = xr_new;
        end //of if statement
        //*******************************************************      
    end //of if-else statement
    //***********************************************************
endfunction
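
The same halving idea can also be sketched with a function argument in place of an expression string, which avoids re-evaluating the string with `evstr` on every pass. This is an alternative sketch, not a replacement for the code above: the names `bisect_sketch` and `g` are made up, and it raises an error instead of returning a message when the interval does not bracket a sign change.

```scilab
// Hypothetical variant: f is a Scilab function, [a, b] the bracket, es the percent tolerance.
function root = bisect_sketch(f, a, b, es)
    fa = f(a);
    fb = f(b);
    if fa == 0 then
        root = a;       // the lower bound is itself a root
        return;
    end
    if fb == 0 then
        root = b;       // the upper bound is itself a root
        return;
    end
    if fa * fb > 0 then
        error("bisect_sketch: f(a) and f(b) must have opposite signs");
    end
    xr = (a + b) / 2;
    ea = 100;
    while ea > es
        fr = f(xr);
        if fr == 0 then
            break;              // landed exactly on the root
        elseif fa * fr < 0 then
            b = xr;             // root lies in the lower half
        else
            a = xr;             // root lies in the upper half
            fa = fr;
        end
        xr_old = xr;
        xr = (a + b) / 2;
        ea = abs((xr - xr_old) / xr) * 100;   // percent relative error
    end
    root = xr;
endfunction

deff("y = g(x)", "y = x - sin(x) - 3");
r = bisect_sketch(g, 0, 6.5, 0.5);
mprintf("root estimate = %f, residual = %f\n", r, g(r));
```

Tightening `es` (for example to 0.001) gives a closer estimate at the cost of more iterations, since each pass only halves the bracketing interval.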