import React from 'react';
import { Container, Row, Col } from 'react-bootstrap';
import SiteCard from '../../components/SiteCard';
import SEOComponent from '../../components/SEOComponent';

const Methodology = ({ url }) => {
  return (
    <>
      <SEOComponent 
        title={'Modernising House Price Indices with Machine Learning'}
        description={'Modernising House Price Indices with Machine Learning'}
        url={url}
        imageUrl={'https://otta.property/blog_images/how_to/England.png'}
      />
      <Container>
        <SiteCard 
          header={'otta.property index'}
          title={'Making a modern house price index using Machine Learning'}
          content={
            <Container>    
              <Row>
                <Col md={10} style={{margin: 'auto'}}>
                  <img src="/blog_images/july_index/england_years.png" alt="England" style={{width: '100%'}}/>
                  <h2>The Foundation</h2>
                  <hr/>
                  <p>
                    The Office for National Statistics established their hedonic regression methodology in 2016, treating properties as bundles of characteristics that contribute to price. Their approach carefully considers location, property type, number of rooms, floor area, and neighbourhood classification, among other factors. This foundation has served as a robust framework for understanding house price movements, but modern machine learning techniques offer opportunities to enhance and streamline this methodology.
                  </p>

                  <h2>Building on Traditional Methods</h2>
                  <hr/>
                  <p>
                    Our machine learning approach maintains the core principles of the ONS methodology while modernising the implementation. We use many of the same fundamental property characteristics - location, type, floor area, and build status - but process them differently. Where the ONS uses discrete categories and linear relationships, our models automatically detect and adapt to non-linear patterns in the data. This means the system can naturally handle cases where the relationship between, say, floor area and price isn't strictly proportional, or where the impact of location varies in complex ways across different neighbourhoods.
                  </p>

                  <h2>The Methodology in Detail</h2>
                  <hr/>
                  <p>At its core, our approach processes property data through several stages:</p>
                  <ul>
                    <li>Combining data from the Land Registry and the Energy Performance Certificate (EPC) database</li>
                    <li>Transforming prices with logarithmic scaling for normalisation</li>
                    <li>Clustering properties geographically at 0.005 degree precision</li>
                    <li>Splitting the data 80-20 into training and test sets for model validation</li>
                    <li>Training an ensemble of models for each month</li>
                  </ul>
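                  <p>
                    As a rough illustration of the transformation and clustering stages, the sketch below log-scales a price and snaps coordinates to a 0.005 degree grid. The function names and the rounding rule are assumptions made for this example rather than a description of our production code, which may bin locations differently.
                  </p>
                  <pre><code>{`// Hypothetical sketch: log-scale a price and snap coordinates to a grid.
const GRID_DEGREES = 0.005; // cell size taken from the stages listed above

// Work in log space so model errors are relative rather than absolute.
const toLogPrice = (price) => Math.log(price);
const fromLogPrice = (logPrice) => Math.exp(logPrice);

// Assumed rule: round to the nearest multiple of the cell size.
const snapToGrid = (degrees) =>
  Number((Math.round(degrees / GRID_DEGREES) * GRID_DEGREES).toFixed(3));

// Properties that share a cell key fall into the same geographic cluster.
const cellKey = (lat, lon) => snapToGrid(lat) + ',' + snapToGrid(lon);

cellKey(51.5074, -0.1278); // a central London coordinate
`}</code></pre>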

                  <p>The index construction process includes:</p>
                  <ul>
                    <li>Annual property baskets for consistency</li>
                    <li>Monthly updates from predictions on the previous year's basket</li>
                    <li>Special handling for January, which uses a basket lagged by two years</li>
                    <li>Geometric mean aggregation for monthly indices</li>
                  </ul>
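                  <p>
                    The aggregation step can be sketched as follows, assuming each basket property has a predicted price for both the current month and a base period. The function names and the base value of 100 are illustrative assumptions, not our exact pipeline.
                  </p>
                  <pre><code>{`// Hypothetical sketch: geometric mean of predicted prices for a fixed basket.
const geometricMean = (values) =>
  Math.exp(values.reduce((sum, v) => sum + Math.log(v), 0) / values.length);

// Index level for one month relative to a base period (assumed base of 100).
const monthlyIndex = (currentPredictions, basePredictions) =>
  100 * geometricMean(currentPredictions) / geometricMean(basePredictions);

// The same three basket properties priced in the base period and the current month.
monthlyIndex([210000, 315000, 525000], [200000, 300000, 500000]);
`}</code></pre>
                  <p>
                    Every price in this example rises by 5%, so the resulting index is approximately 105. Averaging in log space keeps a handful of very expensive properties from dominating the result.
                  </p>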

                  <h2>Geographic Innovation</h2>
                  <hr/>
                  <p>
                    Perhaps the most significant evolution comes in how we handle location. Rather than using discrete location categories and ACORN classifications, our approach leverages precise latitude and longitude coordinates. This allows the model to discover natural neighbourhood boundaries and micro-location effects that might be missed in traditional postcode-based groupings. The model can learn, for instance, that properties on one side of a street might command different prices than those on the other, or that the impact of being near a train station varies depending on the type of neighbourhood.
                  </p>

                  <h2>The Power of Ensembles</h2>
                  <hr/>
                  <p>
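                    Where the ONS relies on a single carefully weighted model, our approach trains an ensemble of models for each month. The ensemble is larger for recent months, and the variation between its models doubles as a measure of uncertainty: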
                  </p>
                  <ul>
                    <li>100 models for recent months</li>
                    <li>10 models for historical periods</li>
                    <li>Natural confidence intervals from model variation</li>
                    <li>Continuous quality tracking through multiple metrics</li>
                  </ul>
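                  <p>
                    One way to read an interval off the ensemble, shown here as a sketch with assumed function names and an assumed 95% percentile rule, is to take the spread of the index values produced by the individual models for a given month.
                  </p>
                  <pre><code>{`// Hypothetical sketch: summarise one month's index values across an ensemble.
const percentile = (sorted, p) => {
  // Linear interpolation between the two nearest ranks.
  const position = (sorted.length - 1) * p;
  const lower = Math.floor(position);
  const upper = Math.ceil(position);
  return sorted[lower] + (sorted[upper] - sorted[lower]) * (position - lower);
};

const ensembleInterval = (indexValues) => {
  const sorted = [...indexValues].sort((a, b) => a - b);
  return {
    median: percentile(sorted, 0.5),
    lower: percentile(sorted, 0.025), // assumed 95% interval
    upper: percentile(sorted, 0.975),
  };
};

// Illustrative values only, one per model in a small ensemble.
ensembleInterval([104.2, 105.1, 104.8, 105.6, 104.9]);
`}</code></pre>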
                  <h3>Model performance</h3>
                  <h5>An example month</h5>
                  <img src="/blog_images/how_to/2006-01_model.png" alt="Model performance for an example month" style={{width: '80%'}}/>

                  <h2>Temporal Adaptability</h2>
                  <hr/>
                  <p>
                    Traditional methods require periodic manual reviews and weight adjustments to stay current. Our machine learning approach, in contrast, adapts automatically to changing market dynamics through monthly model updates. The models also capture complex interactions between features, for instance how the value added by extra floor area differs across property types and locations.
                  </p>

                  <h2>Quality and Speed</h2>
                  <hr/>
                  <p>Key advantages include:</p>
                  <ul>
                    <li>Results within minutes of new data becoming available</li>
                    <li>Parallel processing for multiple models</li>
                    <li>Under one hour of processing for the whole of England and Wales</li>
                    <li>Automated quality control monitoring</li>
                  </ul>

                  <h2>The Evolution of Understanding</h2>
                  <hr/>
                  <p>
                    This machine learning approach represents a natural evolution in how we understand property markets. It retains some of the key statistical principles of the ONS methodology while using modern computational techniques to provide faster, more granular insights. By combining traditional statistical wisdom with contemporary machine learning, we create a system that can capture the complex dynamics of property markets.
                  </p>

                  <p>
                    The result isn't a replacement for existing indices but rather a complementary tool that can provide more immediate insights into market movements. It demonstrates how machine learning can enhance rather than replace traditional statistical approaches, providing a bridge between established methodological principles and cutting-edge computational capabilities.
                  </p>
                </Col>
              </Row>
            </Container>
          }
        />
      </Container>
    </>
  );
};

export default Methodology;