How to Become a Business Analyst?
The business analyst profession is one of the most popular career choices in the business world. The role has plenty to offer not only as a stand-alone career but also as one of the best gateways into the field of analytics and data science. What’s more, the business analyst job allows you to constantly evolve by learning new techniques to address complex business problems and find groundbreaking solutions in today’s technology-driven economy.
So, in this article, we’ll discuss how to become a business analyst.
We’ll discover who the business analyst is, what they do, how much they make, and what skills and academic background you need to become one.
What Is a Business Analyst and How Do They Fit Inside a Company?
The title “Business Analyst” sounds a bit generic at first and may cause hesitation among entry-level candidates. However, the fact that this is a flexible position could be encouraging as the business analyst role encompasses many different activities in a firm. In fact, depending on the organization, this can be an entry-level position or a role for experienced professionals. This means business analysts engage with a wide variety of tasks and the business analyst job description from one firm to the other can be very different. In different businesses, business analysts will be focused on some of the following types of activities:
- Process and systems improvement (in terms of efficiency and effectiveness)
- Solving business problems
- Looking for savings and efficiencies
- Focusing on business development and searching for new opportunities
- Performance analysis
- Competitor analysis

Indeed, “business analyst” is one of the most dynamic roles in a company. They may report their findings to the head of a particular division they are assigned to; or alternatively, they might be discussing a specific case with a product or a project manager. And, in some instances, the business analyst serves as the link between the business development manager and the respective head of division or product owner. One thing is for sure – you will never be bored on the job.
That was the elevator pitch of this exciting job role. But to gain a better understanding of what it means to be a business analyst, we need to take a closer look at their typical day-to-day responsibilities.
What Does a Business Analyst Do?
The answer is – it varies.
In different companies, business analysts carry out different activities. But here are some of the most common business analyst roles and responsibilities.
Business analysts analyze the performance of a particular segment of a company. Very often, they engage with the analysis of different processes, define goals, and formulate hypotheses, all with the aim of optimizing the performance of the segment they have been assigned to.
They not only collect data but also apply data-driven decision making, communicate findings, and oversee the implementation of these findings.
What’s more, business analysts often conduct training for non-technical team members.

So, we can safely say that business analysts are the best communicators among problem-solvers and are always ready to lend their expertise across the organization.
The division business analysts are assigned to can be Sales, Supply Chain, or Administration.
Within that structure, they conduct research, try to rely on data as much as possible, and are typically involved in creating dashboards and other BI tools that allow for easier communication of their findings.

What Is a Business Analyst Salary?
How much does a business analyst make? According to the data, a business analyst earns $68,346 on average. So, if you’re taking your first steps as an entry-level business analyst, you can expect a median salary of $50k a year. Of course, as you gain more experience, your annual pay will also increase – the senior business analyst salary reaches up to $93k! Pretty awesome, right?

What Is the Business Analyst Career Path?
A business analyst job is a great option to explore, both on its own and as the first step on the career ladder to becoming a Product Lead, Head of Product, Head of Division, or even a VP.
Most middle and large companies across all industries – including Consulting, Finance, and Tech – offer full-time business analyst positions.
Consultancy is also very popular for this profession, especially in smaller organizations. However, this option provides a business analyst with a limited view of the business compared to their counterparts employed by a company in a specific industry.
What Business Analyst Skills Do You Need to Acquire?
We researched 1,395 business analyst jobs to discover the desired tools and skills business analyst candidates must have.
Here’s what the data says:
- 60% of job postings emphasized Excel skills
- 41% mentioned strong communication
- 6% requested Tableau
- and 4% – Power BI

What Is the Required Business Analyst Degree?
In terms of business analyst education, 66% of the job posts require a Bachelor’s degree. This is the standard for this profession.

What Is the Necessary Business Analyst Experience?
Most employers in our sample required an average of 4 years on the job. However, 35% of job ads were also suitable for people with no prior working experience.

So, to sum up – if you want to maximize your chances of landing a business analyst job, you definitely need to be proficient in Excel and you need strong communication skills. And, possibly, learning a BI tool (like Tableau or Power BI) could give you an edge too.

Now you’re aware of the most important aspects of the business analyst position, what to expect from the job, and what skills to focus on to become one.
Nevertheless, if you feel like you still need additional career advice and a more detailed analysis of the career opportunities in data science, check out our course Starting a Career in Data Science: Project Portfolio, Resume, and Interview Process.
How to shine in a Data Science Challenge — An Example from DrivenData

Richter’s Predictor — Data Challenge from DrivenData
Scoring in the top one percent in the Richter’s Predictor: Modeling Earthquake Damage challenge on DrivenData.
Besides Kaggle, there are many other websites that host highly relevant and competitive data science competitions. DrivenData is one of them. The main difference between the renowned Kaggle and DrivenData is probably the topics of the challenges: whereas Kaggle hosts more commercially driven competitions, DrivenData focuses more on philanthropic topics.
We, data4help, took part in one of their competitions and scored in the top one percent out of around 3,000 competitors. This blog post explains our approach to the problem and our key learnings.
01 Introduction — Problem Description
The project we chose is called Richter’s Predictor: Modeling Earthquake Damage. As the name suggests, the project involves predicting earthquake damages, specifically damage from the Gorkha earthquake which occurred in April 2015 and killed over 9,000 people. It represents the worst natural disaster to strike Nepal since the 1934 Nepal-Bihar earthquake.

Our task in this project is to forecast how badly an individual house was damaged, given information about its location, secondary usage, and the materials used to build it in the first place. The damage grade of each house is stated as an integer between one and three.

02 How to tackle the project — Plan of attack
The key to success in a Kaggle/DrivenData challenge, just like in a data challenge for a job application, is a solid plan of attack. It is important that this plan is drafted as early as possible; otherwise, the project is likely to become directionless and unstructured. This is especially problematic for data challenges in job applications, which generally serve to gauge whether a candidate can draft a solid strategy for the problem and execute it in a short amount of time.
Therefore, one of the first things to do is to get a pen and paper and sketch out the problem. Afterwards, the toolkit for the prediction should be evaluated. That means we should investigate what kind of training data we have to solve the problem. A thorough analysis of the features is key to high performance.
Do we have any missing values in the data? Do we have categorical variables and, if so, what level of cardinality do we face? How sparse are the binary variables? Are the float/integer variables highly skewed? How is the location of a house defined? All these questions came up when we went through the data for the first time. It is important that all aspects are noted somewhere at this stage in order to prepare a structured approach.
After noting all the initial questions we have, the next step is to lay out a plan and define the order in which the problem is to be evaluated and solved. It is worth noting that we are not expected to have a perfect solution for all the problems we can think of right at the beginning, but rather to consider potential problem areas that could arise. A small sketch of such a first pass over the data is shown below.
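As an illustration only, here is a minimal sketch of how these first questions could be checked with pandas, assuming the training features have already been loaded into the train_values DataFrame that is used later in this post:
import pandas as pd
# Missing values per column
print(train_values.isna().sum().sort_values(ascending=False).head())
# Cardinality of the categorical (object) columns
cat_cols = train_values.select_dtypes(include="object").columns
print(train_values[cat_cols].nunique())
# Sparsity of the binary columns (share of ones)
bin_cols = [c for c in train_values.columns
            if set(train_values[c].unique()) <= {0, 1}]
print(train_values[bin_cols].mean().sort_values().head())
# Skewness of the numeric columns
num_cols = train_values.select_dtypes(include="number").columns
print(train_values[num_cols].skew().sort_values(ascending=False))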
03 Preliminaries & Base model
One of the first steps in any data challenge should be to train a benchmark model. This model should be as simple as possible and only minor feature engineering should be required. The importance of that model is that it gives us an indication of where our journey starts and what a sensible result is.
Given that DrivenData already set a benchmark using a Random Forest model, we also use that model as our baseline. Before the data can be fed into the model, we have to take care of all categorical variables in the data through the handy get_dummies function from pandas. We also remove the variable building_id, which is randomly assigned to each building and hence does not carry any meaning.
import pandas as pd
# Drop the meaningless identifier and one-hot encode the categorical columns
train_values.drop(columns=["building_id"], inplace=True)
dummy_train = pd.get_dummies(train_values)
# Target and feature matrix
y_df = train_labels.loc[:, ["damage_grade"]]
X_df = dummy_train
# Baseline model and score (see the calc_score sketch below)
model = model_dict["lgt"]
baseline_model = calc_score(model, X_df, y_df)
From the model_dict we then retrieve the basic random forest model. With just these couple of lines of code, we have a baseline model and a baseline accuracy of 71.21%. This is now our number to beat!
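The model_dict and calc_score helpers are the authors' own utilities and are not shown in the post. Purely as a hedged sketch of what they might look like, assuming calc_score returns the mean cross-validated accuracy and that a plain scikit-learn random forest sits behind the baseline key:
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical reconstruction of the helpers used above
model_dict = {"lgt": RandomForestClassifier(n_estimators=100, random_state=42)}

def calc_score(model, X_df, y_df, cv=5):
    # Mean cross-validated accuracy of the supplied model
    scores = cross_val_score(model, X_df, y_df.values.ravel(),
                             cv=cv, scoring="accuracy")
    return scores.mean()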
In the next sections, we show the steps taken to try to improve on this baseline.
04 Skewness of the integer variables
As one of the first steps in feature engineering for improving on this baseline, we will further investigate all float and integer variables of the dataset. To make all the numeric variables easier to access, we stored the names of all variables of each kind in a dictionary called variable_dict.
In order to better understand the variables, we plot all integer variables using the package matplotlib:
import matplotlib.pyplot as plt
import seaborn as sns

# Kernel density plot for every integer variable
int_variables = variable_dict["int_variables"]
int_var_df = dummy_train.loc[:, int_variables]
fig, axs = plt.subplots(1, 5, figsize=(60, 10))
for number, ax in enumerate(axs.flat):
    sns.kdeplot(int_var_df.iloc[:, number], bw=1.5, ax=ax,
                shade=True, cbar="GnBu_d")
    ax.tick_params(axis="both", which="major", labelsize=30)
    ax.legend(fontsize=30, loc="upper right")
path = (r"{}\int.png".format(output_path))
fig.savefig(path, bbox_inches="tight")

As we can see from the graph above, all the plots exhibit an excessive rightward skew. That means there are a few observations for each variable which are much higher than the rest of the data. Another way to describe this phenomenon would be to say that the mean of the distribution is higher than the median.
As a refresher, skewness describes the symmetry of a distribution. A normal distribution has, as a reference point, a skewness of zero, given its perfect symmetry. A high (or low) skewness results from having a few obscurely high (or low) observations in the data, which we sometimes also call outliers. The problems with outliers are manifold, but the most important one for us is that they hurt the performance of nearly every prediction model, since they interfere with the model's loss function.
One effective measure to dampen the massive disparity between the observations is to apply the natural logarithm. This is allowed since the logarithmic function represents a strictly monotonic transformation, meaning that the order of the data is not changed when log is applied.
Before being able to apply that measure, we have to deal with the zero values (the natural logarithm of zero is not defined). We do that by simply adding one to every observation before applying the logarithm. Lastly we standardize all variables to further improve our model performance.
import numpy as np
from sklearn.preprocessing import StandardScaler

# Apply the logs and create new, sensible column names
logged_train = dummy_train.loc[:, int_variables]\
    .apply(lambda x: np.log(x + 1))
log_names = ["log_{}".format(x) for x in int_variables]

# Standardize the logged variables and swap them into the training frame
stand_logs = StandardScaler().fit_transform(logged_train)
stand_logs_df = pd.DataFrame(stand_logs, columns=log_names)
for log_col, int_col in zip(stand_logs_df, int_variables):
    dummy_train.loc[:, log_col] = stand_logs_df.loc[:, log_col]
    dummy_train.drop(columns=int_col, inplace=True)

# Plot the newly created log variables
fig, axs = plt.subplots(1, 5, figsize=(60, 10))
for number, ax in enumerate(axs.flat):
    sns.kdeplot(logged_train.iloc[:, number], bw=1.5, ax=ax,
                shade=True, cbar="GnBu_d")
    ax.tick_params(axis="both", which="major", labelsize=30)
    ax.legend(fontsize=30, loc="upper right")
path = (r"{}\logs_int.png".format(output_path))
fig.savefig(path, bbox_inches="tight")
The graph below shows the result of these operations. All distributions look much less skewed and do not exhibit the unwanted obscurely high values which we had before.

Before moving on, it is important for us to validate that the step we just took had a positive effect on the overall performance of the model. We do that by quickly running the new data through our baseline random forest model. Our accuracy is now 73.14%, which represents a slight improvement over our baseline model!
Our performance has increased. That tells us that we took a step in the right direction.
05 Geo Variables — Empirical Bayes Mean Encoding
Arguably the most important set of variables within this challenge is the information on where a house is located. This makes sense intuitively: if a house is located closer to the epicenter, then we would also expect a higher damage grade.
The set of location variables provided within this challenge is threefold. Namely, we get three different geo-identifiers with different levels of granularity. For simplicity, we regard the three identifiers as describing a town, a district, and a street (see below).

These geo-identifiers in their initial state are given in a simple numeric format, as can be seen below.

These integers do not prove to be very useful: even though they come in a numeric format, they do not exhibit any correlation with the target (see graphic below), meaning that a higher identifier number is not associated with higher or lower damage. This makes it difficult for the model to learn from these variables.

In order to create more meaningful features for the model to learn from, we apply a powerful tool, oftentimes used in data science challenges, called encoding. Encoding is normally used to transform categorical variables into a numeric format. At first glance we might think that this does not apply to our case, since the geo-identifiers are already given as numeric variables. However, this understanding of encoders is shortsighted: whether something represents a categorical variable does not depend on its format, but on its interpretation. Hence, these variables could gain greatly in importance when undergoing such a transformation!
There are a dozen different encoding methods, which are nicely summarized in this blogpost. The most promising method for our case would be something called target encoding. Target encoding replaces the categorical feature with the average target variable of this group.
Unfortunately, it is not that easy. This method may work fine for the first geo-identifier (town), but it has some serious drawbacks for the more granular second and third geo-identifiers (district and street). The reason is that many districts and streets occur with only a very low frequency. In these cases, the mean target variable of a group with a small sample size is not representative of the target distribution of the group as a whole and would therefore suffer from high variance as well as high bias. This problem is quite common when dealing with categorical variables of high cardinality.
One workaround for this problem is a mixture between Empirical Bayes and the shrinkage methodology, motivated by paper [1]. Here, the mean of a subgroup is the weighted average of the mean target variable of the subgroup and the mean of the prior.

In our example, that means the encoded value for a certain street is the weighted average between the mean target variable of the observations on this street and the mean of the district this street is in (one variable level higher). This method shrinks the importance of the potentially few observations for one street and takes the bigger picture into account, thereby reducing the overfitting problem shown before when we have only a couple of observations for a given street.
The question may now arise of how we determine the weighting factor lambda. Using the methodology of the paper [1], lambda is defined as a function of the number of observations n in the subgroup and a coefficient m:

Here, m is defined as the ratio of the variance within the group (street) to the variance of the main group (district). That formula makes intuitive sense when we consider a street with a few observations which differ massively in their damage grade. The mean damage grade of this street would therefore suffer from high bias and variance (high sigma). If this street lies in a low-variance district (low tau), it is sensible to drag the mean of the street towards that of the district. This is essentially what the m coefficient captures.
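To make this encoding step concrete, here is a minimal sketch of how such an Empirical Bayes mean encoding could be implemented. It assumes the shrinkage weight takes the common form lambda = n / (n + m) with m being the variance ratio described above (the exact formula from the original post is not reproduced here), and it uses the hypothetical column names geo_level_2_id (district) and geo_level_3_id (street):
import pandas as pd

def eb_mean_encode(df, target, fine="geo_level_3_id", coarse="geo_level_2_id"):
    # encoded(street) = lambda * mean(street) + (1 - lambda) * mean(district)
    # with lambda = n / (n + m) and m = var(street) / var(district)  (assumed form)
    fine_stats = df.groupby([coarse, fine])[target].agg(["mean", "var", "count"])
    coarse_stats = df.groupby(coarse)[target].agg(["mean", "var"])
    merged = fine_stats.join(coarse_stats, on=coarse, rsuffix="_coarse")

    # Few, noisy observations in a stable district -> small lambda -> strong shrinkage
    m = merged["var"].fillna(merged["var_coarse"]) / merged["var_coarse"]
    lam = merged["count"] / (merged["count"] + m.fillna(1.0))
    merged["encoded"] = lam * merged["mean"] + (1 - lam) * merged["mean_coarse"]

    # Map the encoded value back onto every row of df
    enc = merged["encoded"].rename("geo3_encoded").reset_index()
    return df.merge(enc, on=[coarse, fine], how="left")["geo3_encoded"].to_numpy()

# Hypothetical usage (in practice, fit on training folds only to avoid target leakage):
# train = dummy_train.join(train_labels["damage_grade"])
# train["geo3_encoded"] = eb_mean_encode(train, "damage_grade")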

It is worth mentioning that the overall model performance in-sample will drop when applying the Empirical Bayes-shrinkage method compared to using a normal target encoder. This is not surprising since we were dealing with an overfitted model before.
Lastly, we run our model again in order to see whether our actions improved the overall model performance. The resulting F1 score of 76.01% tells us that our changes result in an overall improvement.
06 Feature selection
At this point, it is fair to ask ourselves whether we need all the variables we currently use in our prediction model. If possible, we would like to work with as few features as possible (the parsimony property) without losing too much in our scoring metric.
One benefit of working with tree models is the ability to display feature importances. This metric indicates how important each feature is for our predictions. The following code and graph display the variables nicely.
# Extract and sort the feature importances of the trained model
fimportance = main_rmc_model["model"].feature_importances_
fimportance_df = pd.DataFrame()
fimportance_df.loc[:, "f_imp"] = fimportance
fimportance_df.loc[:, "col"] = dummy_train.columns
fimportance_df.sort_values(by="f_imp", ascending=False, inplace=True)

# Plot the importances as a horizontal bar chart
fig, ax = plt.subplots(1, 1, figsize=(12, 24))
ax = sns.barplot(x="f_imp", y="col",
                 data=fimportance_df,
                 palette="GnBu_d")
ax.tick_params(axis="both", which="major", labelsize=20)
ax.set_xlabel("Feature Importance in %", fontsize=24)
ax.set_ylabel("Features", fontsize=24)
path = (r"{}\feature_importance.png".format(output_path))
fig.savefig(path, bbox_inches="tight")

As we can see from the graph above, the most important variables for predicting the damage grade of a house are the average damage grades of the different geo-locations. This makes sense, since the level of destruction of one house is likely to be correlated with the average damage of the houses around it.
06.1 Low importance of binary variables
The feature importances also show that nearly all binary variables have a low feature importance, meaning they provide the model with little to no predictive information. In order to understand this better, we take a look at the mean of each binary variable, which is a number between zero and one.
# Average value of every binary variable (i.e. the share of ones)
binary_variables = variable_dict["binary_variables"]
mean_binary = pd.DataFrame(dummy_train.loc[:, binary_variables].mean())
mean_binary.loc[:, "type"] = mean_binary.index

fig, ax = plt.subplots(1, 1, figsize=(12, 12))
ax = sns.barplot(x="type", y=0,
                 data=mean_binary,
                 palette="GnBu_d")
ax.tick_params(axis="both", which="major", labelsize=16)
ax.set_xticklabels(mean_binary.loc[:, "type"], rotation=90)
path = (r"{}\binaries_mean.png".format(output_path))
fig.savefig(path, bbox_inches="tight")

As can be seen above, nearly all variables have a mean below ten percent. That implies that most rows are equal to zero, a phenomenon we normally describe as sparsity. Furthermore, it is visible that the binary variables with an average above ten percent also have a higher feature importance within our prediction model.
This finding is in line with the fact that tree models, and especially bagging models like the currently used Random Forest, do not work well with sparse data. Furthermore, a binary variable which is nearly always zero (e.g. has_secondary_usage_school) simply does not carry much meaning, given its low correlation with the target.
Using cross-validation, we find that keeping only the features with an importance of at least 0.01% leaves us with the same F1 score as using all features. This leaves us with 53 variables in total, which, relative to the number of rows we have (260k), seems reasonable and therefore appropriate for the task. A short sketch of this filtering step is shown below.
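As an illustration, a minimal sketch of this filtering step, reusing the fimportance_df frame built above and assuming the 0.01% threshold corresponds to 0.0001 on scikit-learn's importance scale (which sums to one):
# Keep only the columns whose importance passes the 0.01% threshold
threshold = 0.0001
keep_cols = fimportance_df.loc[fimportance_df["f_imp"] >= threshold, "col"]
X_reduced = dummy_train.loc[:, keep_cols]
print("Features kept: {}".format(X_reduced.shape[1]))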
07 Imbalance of damage grades
One of our key learnings in this challenge was how to handle the massive imbalance of the target variable. Namely, not to touch it at all!
When looking at the chart below, we can see that the first damage grade does not appear nearly as often as the second damage grade. It may be tempting to apply some over- or undersampling in order to balance the data and show the model an equal amount of each damage grade. The main problem with this approach is that the test data comes from the same (imbalanced) distribution as the training data, meaning that improving the score for the rarest damage grade through sampling methods comes at the cost of a lower accuracy on the most frequent, and therefore more important, damage grade two.
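A quick way to verify this imbalance and to preserve it during validation, as a small sketch (assuming train_labels holds the damage_grade column and reusing model, X_df and y_df from above; micro-averaged F1 is our understanding of the competition metric):
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Share of each damage grade in the training labels
print(train_labels["damage_grade"].value_counts(normalize=True))

# Stratified folds preserve the natural class distribution in every split
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(model, X_df, y_df.values.ravel(),
                         cv=cv, scoring="f1_micro")
print("Mean micro F1: {:.4f}".format(scores.mean()))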

07.1 Performance & Concluding Remarks
Following all the steps of this blog post (with a few minor tweaks and some hyperparameter tuning) led us to place 34th out of 2,861 competitors.

We are overall quite happy with the placement, given the amount of work we put in. This challenge touched on many different aspects and taught us a lot. Data science challenges are a perfect learning opportunity since they are very close to real-life problems.
We are looking forward to our next one!
References
[1] Micci-Barreca, Daniele (2001). A Preprocessing Scheme for High-Cardinality Categorical Attributes in Classification and Prediction Problems. SIGKDD Explorations, 3(1), 27–32. doi:10.1145/507533.507538.
Machine Vision, how AI brings value to industries
Enterprises are using multiple types of AI applications, with one in ten enterprises using ten or more. The most popular use cases are chatbots, process automation solutions and fraud analytics. Natural language and computer vision AI underpin many prevalent applications as companies embrace the ability to replicate traditionally human activities in software for the first time, according to MMC Ventures.
Nowadays AI, partly as a buzzword, dominates almost all technology-related discussions. I can even risk the statement that there is not a single company in the world that has never considered placing AI at least in its five-year roadmap planning. Moreover, we use it daily: our smartphones or Amazon devices when we say “Call my wife” or “Alexa, open Pandora,” our TVs and Internet TV boxes that recommend what to stream next, cars that display recently recognized road signs, conferencing systems that replace our backgrounds during the so-called “shelter at home” era, and many more.
AI systems have traveled a long way since the first official workshops on the subject, reportedly held in the mid-1950s. Since then, thanks to tremendous progress in many areas (new algorithm design, specialized hardware and cloud services becoming available, the so-called data explosion enabling quality AI training, the development of both open-source and proprietary software libraries, growing investments, a rising number of applications, and increased demand), AI has become a vital tool augmenting human capabilities across industries.

One such example of where AI delivers value is Machine Vision. Machine Vision, or Computer Vision, enables machines to identify objects and analyze scenes and activities in real-life visual environments. It does so by leveraging Deep Learning, sometimes supported by other techniques which in certain scenarios increase its effectiveness. In other words, thanks to all of these technologies, cameras can see and notify people about, e.g., a detected fire or quality issues diagnosed on production lines, count objects on conveyor belts, analyze medical images, monitor buildings and inspect construction areas, or even guide robotic arms through various motions. If something can be captured on a picture or a video, chances are machines can be trained to analyze and identify it as well.

The possibilities are endless and, thanks to the research constantly being conducted in this space, they are mostly limited by human creativity. Needless to say, such AI systems can greatly help humans become more effective, assist them in demanding and repetitive tasks (e.g. exhausting, manual image or document analytics), or even offload dangerous or simply boring jobs (“Again, I need to enter these documents into the system…” — meet John and learn how he solved this problem by reading his story here).

byteLAKE helps companies succeed with AI in many ways:
- Through the AI Workshop, we listen and do our best to understand the needs and goals of our clients and partners. Our experts help them ask the right questions, explain technologies that might be useful, and even shadow their teams to better understand their daily tasks. Then we assist them in preparing deployment plans and new technology roadmaps. We help them understand which challenges can be addressed with existing technologies and which would require additional research, and we guide them on the best ways to transform early ideas into tangible results.
- A proof of concept is the natural next step to demonstrate the first tangible benefits: process optimization, task automation, increased system reliability, etc. This is also the stage where we help and guide our clients to prepare and collect the right data. Our experienced data scientists help process it and prepare it for AI algorithms. Some of the key design decisions are also taken here, all focused on delivering the most efficient solution.
- Solution delivery naturally follows and, in our case, is done in Agile sprints, meaning you get results every two weeks. One thing worth mentioning here is that we have established a strong research practice for a reason: most of our clients come with requests where we can hardly ever build an AI system purely from ready-made components. Although computer vision as a technology already has a bucket full of libraries, a track record of projects, and what have you, our projects are very often far more ambitious than what off-the-shelf components have to offer. Besides, our experience in the closely related area of HPC (High-Performance Computing) helps us deliver solutions that are truly scalable and produce results efficiently at every stage of the AI application lifecycle: training (when we teach the algorithms to do certain things) and inference (when the algorithms do their work).
Along the way, and especially in Computer Vision projects, a natural question is: shall we run AI on the server or closer to where the data is created or stored? As difficult as the question might sound, the real concerns behind it usually are: what amount of data is our system supposed to generate? Are we planning to build small devices that need some sort of AI as well (e.g. a medical diagnostic device)? Is the telecommunication network in our factory reliable? Will it offer enough bandwidth? And many others we help our clients and partners consider during the AI Workshop phase.
Another tricky part is related to hardware design but here the rule of thumb is that most of our clients leave this task to us. And this is also where we closely work with our partners to select the right components, be it an embedded device or a large scale data center.

Then come the real use cases. We have delivered plenty and have started wrapping some of them up into products. Let me mention only a few to show how AI, and Machine Vision in particular, brings value across industries.
- Manufacturing / Industry 4.0 / Factories
Here, most of the scenarios we have been working on are about production line monitoring and danger identification. Sometimes it is about product quality inspection, where cameras are used to monitor the products on conveyor belts. Highly optimized AI algorithms take pictures, analyze them in real time, and notify other systems immediately as faulty products or irregularities are identified. In other situations, byteLAKE’s solutions monitor production processes and detect, e.g., dangerous situations (like oil stains potentially causing failures) or unusual incidents (like wrong proportions of chemical substances potentially generating issues). The bottom line is that instead of forcing humans to look for hours at certain areas just to wait for events that might or might not happen, we might as well place AI-powered cameras there to monitor and notify us about suspicious events.
- Hospitality / Hotels
We are working on another product of ours, in a consortium with a company named Protel (Turkey). While it is not yet time to share the details (a press release is being planned), I can tell you that the AI-powered cameras will support a variety of self-service functionalities. If you happen to work in the hotel industry, do reach out to us, as we are about to have something cool for you and your visitors.
- Agriculture / Forestry / Government / Medical
Many scenarios: from tree counting and the localization of illegal dumping areas, through traffic analytics, to complex visual data analysis directly on small, constrained, embedded devices. Other times it is about building monitoring, e.g. detecting certain behaviors at entry gates (such as whether people are wearing a helmet). We are also working with 3D visual data and building systems that guide robotic arms through various tasks. 3D medical data analytics and numerical algorithms for implant-related work have also been within the scope of our work.
How Artificial Intelligence is Changing the Real Estate Industry

Artificial intelligence or AI is used — often unnoticed by humans — in more and more areas. This is also the case in the real estate industry, where the capabilities of neural networks open up new opportunities for the sale, operation, and maintenance of buildings. The areas of application in the real estate industry are diverse. Four areas stand out in particular.
Neural networks can be roughly differentiated by the extent to which they take on tasks autonomously. In some areas, AI acts without human intervention; in others, it merely supports humans in various activities. Systems of both types can also be found in the real estate industry. The solutions presented below are either already in use or will be ready for the market within the next few years.
But why AI? Homo sapiens is good at making connections between “obvious” facts. For most of us, the following argument should seem logical: “People with children buy a property with a garden more often.” Artificial intelligence is powerful precisely because it does not perceive the world the way we do, in three dimensions and with a social imprint. Thanks to machine learning, it sees and processes information fundamentally differently than living beings.
As a result, neural networks are increasingly able to establish connections between aspects that are less obvious to humans: “People with children aged 2–4 are most likely to buy a property with a garden on Tuesdays when it is not raining and the outside temperature is more than 18 degrees.”

AI is more than Siri and Alexa
Everyday life shows part of the answer: programs recognize faces and unlock the smartphone, Siri and Alexa organize our daily life and cars drive (partially) autonomously. AI is a helpful, self-learning technique for assessing risks, organizing vast amounts of data, finding solutions, and making processes more efficient.
Organizing data or finding solutions is a significant advantage for the real estate industry, which generally produces vast amounts of data. But that requires intensive preparatory work.
Before a machine or a program is even able to learn by itself, these artificial helpers must first be “trained.” Before an application can automatically read and evaluate rental agreements, the system must first be programmed. The quality of the output, therefore, depends entirely on the input of the human operator.
Artificial intelligence accelerates document review
The advantages can, therefore, be seen quickly. Europe seems to be more progressive in throwing cherished traditions overboard, provided that it serves to improve efficiency.
Trending AI Articles:
1. Fundamentals of AI, ML and Deep Learning for Product Managers
3. Graph Neural Network for 3D Object Detection in a Point Cloud
4. Know the biggest Notable difference between AI vs. Machine Learning
Currently, AI is mostly used in transactions for information management, such as for large real estate portfolios. In connection with a data room, all relevant information on the individual property or entire portfolios comes together and can, for example, be analyzed before a transaction.
Artificial intelligence as a real estate agent
Such analyses require the collection of a lot of data, often called Big Data. If a sufficient amount of anonymized behavioral data is available, neural networks can predict with a very high degree of probability when, where, and at what price users of a web platform will rent or buy real estate.
Whether it is a restored old building in the city center or a modern passive house in the country, these systems also know which type of property potential buyers are most likely to go for and can provide them with relevant offers.
An AI as a real estate agent? An experiment on the Denver real estate market shows that this works. For this purpose, a test candidate with an imaginary budget selected the three properties from the urban area that he liked the most. Based on this information, three human real estate agents from Denver and an artificial intelligence sent him two further suggestions for buying property in Colorado’s capital.
As it turned out, the properties selected by the AI were the ones the test candidate liked most. The potential advantage for companies is that such systems can be used to address customers more precisely and provide them with relevant content.
AI in building automation
The amount of built-in technology in buildings has been increasing steadily for years, and that makes perfect sense. With IoT-enabled devices, sensors, and other technologies, real estate can be operated more efficiently and with less energy.
The convenience for users also increases. However, this goes hand in hand with increasing complexity. To achieve maximum efficiency, it is therefore not enough to simply collect and evaluate the data, because buildings are subject to various dynamic forces, such as the weather or changing room occupancy.
One possibility is to put the collected data in a temporal context. In this way, patterns of operation and use can be identified over hours, days, or months. In combination with building technology, artificial intelligence can recognize such trends. This enables operators to better predict certain developments, such as expected energy consumption.
Artificial intelligence in real estate management and real estate software solutions
The above approach to building automation can also be used in the area of property management. When is there a high probability of a need for renovation in which buildings? In which properties are high personnel costs to be expected in the coming quarter? Depending on the data collected, artificial intelligence solutions will make it easier to answer these questions about property management in the future.
Another possible application in this area is on the user’s side. An example: the tenant of an apartment detects damage in the property. With the help of a virtual assistant, she transmits information about the deficiency to facility management. An AI automatically processes the transmitted data so that an employee can immediately act on it.
Automation in building security
Neural networks are also increasingly being used for surveillance, as they, unlike a single security guard, can process an unlimited number of video signals. Image-processing systems recognize when a camera in the building is recording movement and, in a matter of seconds, decide whether it is a person, an animal, or an object.
If it is a person, facial recognition is used to determine whether they are in the monitored area without authorization. If this is the case, the porter is informed. Other systems recognize, for example at airports, whether a person is carrying dangerous or prohibited objects. Shoplifters, watch out: artificial intelligence should now also be able to register suspicious behavior.
Conclusion
Before an artificial intelligence can independently recognize documents such as a building permit or a lease, goals must first be defined. The software is then “trained” with data toward these goals. The advantages of artificial intelligence in the real estate sector are only fully realized if the quality of the developed algorithms is right; the input ultimately determines the quality of the output.