Saturday, 8 March 2014

OPH 111 . FUNDAMENTAL PHYSICS ---BY. MWL. JAPHET MASATU

Fundamentals of Physics

Fundamentals of Physics
The cover page of Fundamentals of Physics Extended 9th edition.
Authors	David Halliday, Robert Resnick, Jearl Walker
Country	United States of America
Language	American English
Subject	Physics
Genre	Textbook
Published	1960 (John Wiley & Sons, Inc.)
Media type	Print (hardcover)

Fundamentals of Physics is a calculus-based physics textbook by David Halliday, Robert Resnick, and Jearl Walker. ^[1] The textbook is currently in its tenth edition and is published in a five-volume set. The current version is a revised version of the original textbook Physics by Halliday and Resnick, first published in 1960. It is widely used in colleges as part of the undergraduate physics courses, and has been well known to science and engineering students for decades as "the gold standard" of freshman-level physics texts. In 2002, the American Physical Society named the work the most outstanding introductory physics text of the 20th century.
The textbook covers most of the basic topics in physics:

The extended edition also contains introductions to topics such as quantum mechanics, atomic theory, solid-state physics, nuclear physics and cosmology. A solutions manual and a study guide are also available. ^[

From Wikipedia, the free encyclopedia

This article is about an area of scientific study. For other uses, see Mechanic (disambiguation). MECHA

Thermodynamics

From Wikipedia, the free encyclopedia

Annotated color version of the original 1824 Carnot heat engine showing the hot body (boiler), working body (system, steam), and cold body (water), the letters labeled according to the stopping points in Carnot cycle

Thermodynamics
The classical Carnot heat engine
Branches[show]
Laws [show]
Systems [show]
System properties [show]
Material properties [show]
Equations [show]
Potentials [show]
History / Culture[show]
Scientists[show]
Book:Thermodynamics
v t e

Thermodynamics is a branch of natural science concerned with heat and temperature and their relation to energy and work. It defines macroscopic variables, such as internal energy, entropy, and pressure, that partly describe a body of matter or radiation. It states that the behavior of those variables is subject to general constraints, that are common to all materials, not the peculiar properties of particular materials. These general constraints are expressed in the four laws of thermodynamics. Thermodynamics describes the bulk behavior of the body, not the microscopic behaviors of the very large numbers of its microscopic constituents, such as molecules. Its laws are explained by statistical mechanics, in terms of the microscopic constituents.
Thermodynamics applies to a wide variety of topics in science and engineering.
Historically, thermodynamics developed out of a desire to increase the efficiency and power output of early steam engines, particularly through the work of French physicist Nicolas Léonard Sadi Carnot (1824) who believed that the efficiency of heat engines was the key that could help France win the Napoleonic Wars.^[1] Irish-born British physicist Lord Kelvin was the first to formulate a concise definition of thermodynamics in 1854:^[2]

"Thermo-dynamics is the subject of the relation of heat to forces acting between contiguous parts of bodies, and the relation of heat to electrical agency."

Initially, thermodynamics, as applied to heat engines, was concerned with the thermal properties of their 'working materials' such as steam, in an effort to increase the efficiency and power output of engines. Thermodynamics later expanded to the study of energy transfers in chemical processes, for example to the investigation, published in 1840, of the heats of chemical reactions^[3] by Germain Hess, which was not originally explicitly concerned with the relation between energy exchanges by heat and work. From this evolved the study of Chemical thermodynamics and the role of entropy in chemical reactions.^[4]^[5]^[6]^[7]^[8]^[9]^[10]^[11]^[12]

Introduction

The plain term 'thermodynamics' refers to a macroscopic description of bodies and processes.^[13] "Any reference to atomic constitution is foreign to classical thermodynamics."^[14] The qualified term 'statistical thermodynamics' refers to descriptions of bodies and processes in terms of the atomic constitution of matter, mainly described by sets of items all alike, so as to have equal probabilities.
Thermodynamics arose from the study of two distinct kinds of transfer of energy, as heat and as work, and the relation of those to the system's macroscopic variables of volume, pressure and temperature.^[15]^[16]
Thermodynamic equilibrium is one of the most important concepts for thermodynamics.^[17] The temperature of a thermodynamic system is well defined, and is perhaps the most characteristic quantity of thermodynamics. As the systems and processes of interest are taken further from thermodynamic equilibrium, their exact thermodynamical study becomes more difficult. Relatively simple approximate calculations, however, using the variables of equilibrium thermodynamics, are of much practical value. In many important practical cases, as in heat engines or refrigerators, the systems consist of many subsystems at different temperatures and pressures. In practice, thermodynamic calculations deal effectively with these complicated dynamic systems provided the equilibrium thermodynamic variables are nearly enough well-defined.
Central to thermodynamic analysis are the definitions of the system, which is of interest, and of its surroundings.^[8]^[18] The surroundings of a thermodynamic system consist of physical devices and of other thermodynamic systems that can interact with it. An example of a thermodynamic surrounding is a heat bath, which is held at a prescribed temperature, regardless of how much heat might be drawn from it.
There are three fundamental kinds of physical entities in thermodynamics, states of a system, thermodynamic processes of a system, and thermodynamic operations. This allows two fundamental approaches to thermodynamic reasoning, that in terms of states of a system, and that in terms of cyclic processes of a system.
A thermodynamic system can be defined in terms of its states. In this way, a thermodynamic system is a macroscopic physical object, explicitly specified in terms of macroscopic physical and chemical variables that describe its macroscopic properties. The macroscopic state variables of thermodynamics have been recognized in the course of empirical work in physics and chemistry.^[9]
A thermodynamic operation is an artificial physical manipulation that changes the definition of a system or its surroundings. Usually it is a change of the permeability or some other feature of a wall of the system.,^[19] that allows energy (as heat or work) or matter (mass) to be exchanged with the environment For example, the partition between two thermodynamic systems can be removed so as to produce a single system. A thermodynamic operation usually leads to a thermodynamic process of transfer of mass or energy that changes the state of the system, and the transfer occurs in natural accord with the laws of thermodynamics. Thermodynamic operations are not the only initiators of thermodynamic processes. Also of course changes in the intensive or extensive variables of the surroundings can initiate thermodynamic processes.
A thermodynamic system can also be defined in terms of the cyclic processes that it can undergo.^[20] A cyclic process is a cyclic sequence of thermodynamic operations and processes that can be repeated indefinitely often without changing the final state of the system.
For thermodynamics and statistical thermodynamics to apply to a system subjected to a process, it is necessary that the atomic mechanisms of the process fall into one of two classes:

those so rapid that, in the time frame of the process of interest, the atomic states effectively visit all of their accessible range, bringing the system to its state of internal thermodynamic equilibrium; and
those so slow that their progress can be neglected in the time frame of the process of interest.^[21]^[22]

The rapid atomic mechanisms represent the internal energy of the system. They mediate the macroscopic changes that are of interest for thermodynamics and statistical thermodynamics, because they quickly bring the system near enough to thermodynamic equilibrium. "When intermediate rates are present, thermodynamics and statistical mechanics cannot be applied."^[21] Such intermediate rate atomic processes do not bring the system near enough to thermodynamic equilibrium in the time frame of the macroscopic process of interest. This separation of time scales of atomic processes is a theme that recurs throughout the subject.
For example, classical thermodynamics is characterized by its study of materials that have equations of state or characteristic equations. They express equilibrium relations between macroscopic mechanical variables and temperature and internal energy. They express the constitutive peculiarities of the material of the system. A classical material can usually be described by a function that makes pressure dependent on volume and temperature, the resulting pressure being established much more rapidly than any imposed change of volume or temperature.^[23]^[24]^[25]^[26]
The present article takes a gradual approach to the subject, starting with a focus on cyclic processes and thermodynamic equilibrium, and then gradually beginning to further consider non-equilibrium systems.
Thermodynamic facts can often be explained by viewing macroscopic objects as assemblies of very many microscopic or atomic objects that obey Hamiltonian dynamics.^[8]^[27]^[28] The microscopic or atomic objects exist in species, the objects of each species being all alike. Because of this likeness, statistical methods can be used to account for the macroscopic properties of the thermodynamic system in terms of the properties of the microscopic species. Such explanation is called statistical thermodynamics; also often it is referred to by the term 'statistical mechanics', though this term can have a wider meaning, referring to 'microscopic objects', such as economic quantities, that do not obey Hamiltonian dynamics.^[27]

History

The thermodynamicists representative of the original eight founding schools of thermodynamics. The schools with the most-lasting effect in founding the modern versions of thermodynamics are the Berlin school, particularly as established in Rudolf Clausius’s 1865 textbook The Mechanical Theory of Heat, the Vienna school, with the statistical mechanics of Ludwig Boltzmann, and the Gibbsian school at Yale University, American engineer Willard Gibbs' 1876 On the Equilibrium of Heterogeneous Substances launching chemical thermodynamics.

The history of thermodynamics as a scientific discipline generally begins with Otto von Guericke who, in 1650, built and designed the world's first vacuum pump and demonstrated a vacuum using his Magdeburg hemispheres. Guericke was driven to make a vacuum in order to disprove Aristotle's long-held supposition that 'nature abhors a vacuum'. Shortly after Guericke, the physicist and chemist Robert Boyle had learned of Guericke's designs and, in 1656, in coordination with scientist Robert Hooke, built an air pump.^[29] Using this pump, Boyle and Hooke noticed a correlation between pressure, temperature, and volume. In time, Boyle's Law was formulated, stating that for a gas at constant temperature, its pressure and volume are inversely proportional. In 1679, based on these concepts, an associate of Boyle's named Denis Papin built a steam digester, which was a closed vessel with a tightly fitting lid that confined steam until a high pressure was generated.
Later designs implemented a steam release valve that kept the machine from exploding. By watching the valve rhythmically move up and down, Papin conceived of the idea of a piston and a cylinder engine. He did not, however, follow through with his design. Nevertheless, in 1697, based on Papin's designs, engineer Thomas Savery built the first engine, followed by Thomas Newcomen in 1712. Although these early engines were crude and inefficient, they attracted the attention of the leading scientists of the time.
The concepts of heat capacity and latent heat, which were necessary for development of thermodynamics, were developed by professor Joseph Black at the University of Glasgow, where James Watt worked as an instrument maker. Watt consulted with Black on tests of his steam engine, but it was Watt who conceived the idea of the external condenser, greatly raising the steam engine's efficiency.^[30] Drawing on all the previous work led Sadi Carnot, the "father of thermodynamics", to publish Reflections on the Motive Power of Fire (1824), a discourse on heat, power, energy and engine efficiency. The paper outlined the basic energetic relations between the Carnot engine, the Carnot cycle, and motive power. It marked the start of thermodynamics as a modern science.^[11]
The first thermodynamic textbook was written in 1859 by William Rankine, originally trained as a physicist and a civil and mechanical engineering professor at the University of Glasgow.^[31] The first and second laws of thermodynamics emerged simultaneously in the 1850s, primarily out of the works of William Rankine, Rudolf Clausius, and William Thomson (Lord Kelvin).
The foundations of statistical thermodynamics were set out by physicists such as James Clerk Maxwell, Ludwig Boltzmann, Max Planck, Rudolf Clausius and J. Willard Gibbs.
From 1873 to '76, the American mathematical physicist Josiah Willard Gibbs published a series of three papers, the most famous being "On the equilibrium of heterogeneous substances".^[4] Gibbs showed how thermodynamic processes, including chemical reactions, could be graphically analyzed. By studying the energy, entropy, volume, chemical potential, temperature and pressure of the thermodynamic system, one can determine if a process would occur spontaneously.^[32] Chemical thermodynamics was further developed by Pierre Duhem,^[5] Gilbert N. Lewis, Merle Randall,^[6] and E. A. Guggenheim,^[7]^[8] who applied the mathematical methods of Gibbs.

The lifetimes of some of the most important contributors to thermodynamics.

Etymology

The etymology of thermodynamics has an intricate history. It was first spelled in a hyphenated form as an adjective (thermo-dynamic) and from 1854 to 1868 as the noun thermo-dynamics to represent the science of generalized heat engines.
The components of the word thermo-dynamic are derived from the Greek words θέρμη therme, meaning "heat," and δύναμις dynamis, meaning "power" (Haynie claims that the word was coined around 1840).^[33]^[34]
Pierre Perrot claims that the term thermodynamics was coined by James Joule in 1858 to designate the science of relations between heat and power.^[11] Joule, however, never used that term, but used instead the term perfect thermo-dynamic engine in reference to Thomson’s 1849^[35] phraseology.
By 1858, thermo-dynamics, as a functional term, was used in William Thomson's paper An Account of Carnot's Theory of the Motive Power of Heat.^[35]

Branches of description

Thermodynamic systems are theoretical constructions used to model physical systems that exchange matter and energy in terms of the laws of thermodynamics. The study of thermodynamical systems has developed into several related branches, each using a different fundamental model as a theoretical or experimental basis, or applying the principles to varying types of systems.

Classical thermodynamics

Classical thermodynamics accounts for the adventures of a thermodynamic system in terms, either of its time-invariant equilibrium states, or else of its continually repeated cyclic processes, but, formally, not both in the same account. It uses only time-invariant, or equilibrium, macroscopic quantities measureable in the laboratory, counting as time-invariant a long-term time-average of a quantity, such as a flow, generated by a continually repetitive process.^[36]^[37] In classical thermodynamics, rates of change are not admitted as variables of interest. An equilibrium state stands endlessly without change over time, while a continually repeated cyclic process runs endlessly without a net change on the system over time.
In the account in terms of equilibrium states of a system, a state of thermodynamic equilibrium in a simple system is spatially homogeneous.
In the classical account solely in terms of a cyclic process, the spatial interior of the 'working body' of that process is not considered; the 'working body' thus does not have a defined internal thermodynamic state of its own because no assumption is made that it should be in thermodynamic equilibrium; only its inputs and outputs of energy as heat and work are considered.^[38] It is common to describe a cycle theoretically as composed of a sequence of very many thermodynamic operations and processes. This creates a link to the description in terms of equilibrium states. The cycle is then theoretically described as a continuous progression of equilibrium states.
Classical thermodynamics was originally concerned with the transformation of energy in a cyclic process, and the exchange of energy between closed systems defined only by their equilibrium states. The distinction between transfers of energy as heat and as work was central.
As classical thermodynamics developed, the distinction between heat and work became less central. This was because there was more interest in open systems, for which the distinction between heat and work is not simple, and is beyond the scope of the present article. Alongside the amount of heat transferred as a fundamental quantity, entropy was gradually found to be a more generally applicable concept, especially when considering chemical reactions. Massieu in 1869 considered entropy as the basic dependent thermodynamic variable, with energy potentials and the reciprocal of the thermodynamic temperature as fundamental independent variables. Massieu functions can be useful in present-day non-equilibrium thermodynamics. In 1875, in the work of Josiah Willard Gibbs, entropy was considered a fundamental independent variable, while internal energy was a dependent variable.^[39]
All actual physical processes are to some degree irreversible. Classical thermodynamics can consider irreversible processes, but its account in exact terms is restricted to variables that refer only to initial and final states of thermodynamic equilibrium, or to rates of input and output that do not change with time. For example, classical thermodynamics can consider time-average rates of flows generated by continually repeated irreversible cyclic processes. Also it can consider irreversible changes between equilibrium states of systems consisting of several phases (as defined below in this article), or with removable or replaceable partitions. But for systems that are described in terms of equilibrium states, it considers neither flows, nor spatial inhomogeneities in simple systems with no externally imposed force fields such as gravity. In the account in terms of equilibrium states of a system, descriptions of irreversible processes refer only to initial and final static equilibrium states; the time it takes to change thermodynamic state is not considered.^[40]^[41]

Local equilibrium thermodynamics

Local equilibrium thermodynamics is concerned with the time courses and rates of progress of irreversible processes in systems that are smoothly spatially inhomogeneous. It admits time as a fundamental quantity, but only in a restricted way. Rather than considering time-invariant flows as long-term-average rates of cyclic processes, local equilibrium thermodynamics considers time-varying flows in systems that are described by states of local thermodynamic equilibrium, as follows.
For processes that involve only suitably small and smooth spatial inhomogeneities and suitably small changes with time, a good approximation can be found through the assumption of local thermodynamic equilibrium. Within the large or global region of a process, for a suitably small local region, this approximation assumes that a quantity known as the entropy of the small local region can be defined in a particular way. That particular way of definition of entropy is largely beyond the scope of the present article, but here it may be said that it is entirely derived from the concepts of classical thermodynamics; in particular, neither flow rates nor changes over time are admitted into the definition of the entropy of the small local region. It is assumed without proof that the instantaneous global entropy of a non-equilibrium system can be found by adding up the simultaneous instantaneous entropies of its constituent small local regions. Local equilibrium thermodynamics considers processes that involve the time-dependent production of entropy by dissipative processes, in which kinetic energy of bulk flow and chemical potential energy are converted into internal energy at time-rates that are explicitly accounted for. Time-varying bulk flows and specific diffusional flows are considered, but they are required to be dependent variables, derived only from material properties described only by static macroscopic equilibrium states of small local regions. The independent state variables of a small local region are only those of classical thermodynamics.

Generalized or extended thermodynamics

Like local equilibrium thermodynamics, generalized or extended thermodynamics also is concerned with the time courses and rates of progress of irreversible processes in systems that are smoothly spatially inhomogeneous. It describes time-varying flows in terms of states of suitably small local regions within a global region that is smoothly spatially inhomogeneous, rather than considering flows as time-invariant long-term-average rates of cyclic processes. In its accounts of processes, generalized or extended thermodynamics admits time as a fundamental quantity in a more far-reaching way than does local equilibrium thermodynamics. The states of small local regions are defined by macroscopic quantities that are explicitly allowed to vary with time, including time-varying flows. Generalized thermodynamics might tackle such problems as ultrasound or shock waves, in which there are strong spatial inhomogeneities and changes in time fast enough to outpace a tendency towards local thermodynamic equilibrium. Generalized or extended thermodynamics is a diverse and developing project, rather than a more or less completed subject such as is classical thermodynamics.^[42]^[43]
For generalized or extended thermodynamics, the definition of the quantity known as the entropy of a small local region is in terms beyond those of classical thermodynamics; in particular, flow rates are admitted into the definition of the entropy of a small local region. The independent state variables of a small local region include flow rates, which are not admitted as independent variables for the small local regions of local equilibrium thermodynamics.
Outside the range of classical thermodynamics, the definition of the entropy of a small local region is no simple matter. For a thermodynamic account of a process in terms of the entropies of small local regions, the definition of entropy should be such as to ensure that the second law of thermodynamics applies in each small local region. It is often assumed without proof that the instantaneous global entropy of a non-equilibrium system can be found by adding up the simultaneous instantaneous entropies of its constituent small local regions. For a given physical process, the selection of suitable independent local non-equilibrium macroscopic state variables for the construction of a thermodynamic description calls for qualitative physical understanding, rather than being a simply mathematical problem concerned with a uniquely determined thermodynamic description. A suitable definition of the entropy of a small local region depends on the physically insightful and judicious selection of the independent local non-equilibrium macroscopic state variables, and different selections provide different generalized or extended thermodynamical accounts of one and the same given physical process. This is one of the several good reasons for considering entropy as an epistemic physical variable, rather than as a simply material quantity. According to a respected author: "There is no compelling reason to believe that the classical thermodynamic entropy is a measurable property of nonequilibrium phenomena, ..."^[44]

Statistical thermodynamics

Statistical thermodynamics, also called statistical mechanics, emerged with the development of atomic and molecular theories in the second half of the 19th century and early 20th century. It provides an explanation of classical thermodynamics. It considers the microscopic interactions between individual particles and their collective motions, in terms of classical or of quantum mechanics. Its explanation is in terms of statistics that rest on the fact the system is composed of several species of particles or collective motions, the members of each species respectively being in some sense all alike.

Thermodynamic equilibrium

Equilibrium thermodynamics studies transformations of matter and energy in systems at or near thermodynamic equilibrium. In thermodynamic equilibrium, a system's properties are, by definition, unchanging in time. In thermodynamic equilibrium no macroscopic change is occurring or can be triggered; within the system, every microscopic process is balanced by its opposite; this is called the principle of detailed balance. A central aim in equilibrium thermodynamics is: given a system in a well-defined initial state, subject to specified constraints, to calculate what the equilibrium state of the system is.^[45]
In theoretical studies, it is often convenient to consider the simplest kind of thermodynamic system. This is defined variously by different authors.^[40]^[46]^[47]^[48]^[49]^[50] For the present article, the following definition is convenient, as abstracted from the definitions of various authors. A region of material with all intensive properties continuous in space and time is called a phase. A simple system is for the present article defined as one that consists of a single phase of a pure chemical substance, with no interior partitions.
Within a simple isolated thermodynamic system in thermodynamic equilibrium, in the absence of externally imposed force fields, all properties of the material of the system are spatially homogeneous.^[51] Much of the basic theory of thermodynamics is concerned with homogeneous systems in thermodynamic equilibrium.^[4]^[52]
Most systems found in nature or considered in engineering are not in thermodynamic equilibrium, exactly considered. They are changing or can be triggered to change over time, and are continuously and discontinuously subject to flux of matter and energy to and from other systems.^[53] For example, according to Callen, "in absolute thermodynamic equilibrium all radioactive materials would have decayed completely and nuclear reactions would have transmuted all nuclei to the most stable isotopes. Such processes, which would take cosmic times to complete, generally can be ignored.".^[53] Such processes being ignored, many systems in nature are close enough to thermodynamic equilibrium that for many purposes their behaviour can be well approximated by equilibrium calculations.

Quasi-static transfers between simple systems are nearly in thermodynamic equilibrium and are reversible

It very much eases and simplifies theoretical thermodynamical studies to imagine transfers of energy and matter between two simple systems that proceed so slowly that at all times each simple system considered separately is near enough to thermodynamic equilibrium. Such processes are sometimes called quasi-static and are near enough to being reversible.^[54]^[55]

Natural processes are partly described by tendency towards thermodynamic equilibrium and are irreversible

If not initially in thermodynamic equilibrium, simple isolated thermodynamic systems, as time passes, tend to evolve naturally towards thermodynamic equilibrium. In the absence of externally imposed force fields, they become homogeneous in all their local properties. Such homogeneity is an important characteristic of a system in thermodynamic equilibrium in the absence of externally imposed force fields.
Many thermodynamic processes can be modeled by compound or composite systems, consisting of several or many contiguous component simple systems, initially not in thermodynamic equilibrium, but allowed to transfer mass and energy between them. Natural thermodynamic processes are described in terms of a tendency towards thermodynamic equilibrium within simple systems and in transfers between contiguous simple systems. Such natural processes are irreversible.^[56]

Non-equilibrium thermodynamics

Non-equilibrium thermodynamics^[57] is a branch of thermodynamics that deals with systems that are not in thermodynamic equilibrium; it is also called thermodynamics of irreversible processes. Non-equilibrium thermodynamics is concerned with transport processes and with the rates of chemical reactions.^[58] Non-equilibrium systems can be in stationary states that are not homogeneous even when there is no externally imposed field of force; in this case, the description of the internal state of the system requires a field theory.^[59]^[60]^[61] One of the methods of dealing with non-equilibrium systems is to introduce so-called 'internal variables'. These are quantities that express the local state of the system, besides the usual local thermodynamic variables; in a sense such variables might be seen as expressing the 'memory' of the materials. Hysteresis may sometimes be described in this way. In contrast to the usual thermodynamic variables, 'internal variables' cannot be controlled by external manipulations.^[62] This approach is usually unnecessary for gases and liquids, but may be useful for solids.^[63] Many natural systems still today remain beyond the scope of currently known macroscopic thermodynamic methods.

Laws of thermodynamics

Main article: Laws of thermodynamics

Thermodynamics states a set of four laws that are valid for all systems that fall within the constraints implied by each. In the various theoretical descriptions of thermodynamics these laws may be expressed in seemingly differing forms, but the most prominent formulations are the following:

Zeroth law of thermodynamics: If two systems are each in thermal equilibrium with a third, they are also in thermal equilibrium with each other.

This statement implies that thermal equilibrium is an equivalence relation on the set of thermodynamic systems under consideration. Systems are said to be in thermal equilibrium with each other if spontaneous molecular thermal energy exchanges between them do not lead to a net exchange of energy. This law is tacitly assumed in every measurement of temperature. For two bodies known to be at the same temperature, deciding if they are in thermal equilibrium when put into thermal contact does not require actually bringing them into contact and measuring any changes of their observable properties in time.^[64] In traditional statements, the law provides an empirical definition of temperature and justification for the construction of practical thermometers. In contrast to absolute thermodynamic temperatures, empirical temperatures are measured just by the mechanical properties of bodies, such as their volumes, without reliance on the concepts of energy, entropy or the first, second, or third laws of thermodynamics.^[48]^[65] Empirical temperatures lead to calorimetry for heat transfer in terms of the mechanical properties of bodies, without reliance on mechanical concepts of energy.
The physical content of the zeroth law has long been recognized. For example, Rankine in 1853 defined temperature as follows: "Two portions of matter are said to have equal temperatures when neither tends to communicate heat to the other."^[66] Maxwell in 1872 stated a "Law of Equal Temperatures".^[67] He also stated: "All Heat is of the same kind."^[68] Planck explicitly assumed and stated it in its customary present-day wording in his formulation of the first two laws.^[69] By the time the desire arose to number it as a law, the other three had already been assigned numbers, and so it was designated the zeroth law.

First law of thermodynamics: The increase in internal energy of a closed system is equal to the difference of the heat supplied to the system and the work done by it: ΔU = Q - W ^[70]^[71]^[72]^[73]^[74]^[75]^[76]^[77]^[78]^[79]^[80]

The first law of thermodynamics asserts the existence of a state variable for a system, the internal energy, and tells how it changes in thermodynamic processes. The law allows a given internal energy of a system to be reached by any combination of heat and work. It is important that internal energy is a variable of state of the system (see Thermodynamic state) whereas heat and work are variables that describe processes or changes of the state of systems.
The first law observes that the internal energy of an isolated system obeys the principle of conservation of energy, which states that energy can be transformed (changed from one form to another), but cannot be created or destroyed.^[81]^[82]^[83]^[84]^[85]

Second law of thermodynamics: Heat cannot spontaneously flow from a colder location to a hotter location.

The second law of thermodynamics is an expression of the universal principle of dissipation of kinetic and potential energy observable in nature. The second law is an observation of the fact that over time, differences in temperature, pressure, and chemical potential tend to even out in a physical system that is isolated from the outside world. Entropy is a measure of how much this process has progressed. The entropy of an isolated system that is not in equilibrium tends to increase over time, approaching a maximum value at equilibrium.
In classical thermodynamics, the second law is a basic postulate applicable to any system involving heat energy transfer; in statistical thermodynamics, the second law is a consequence of the assumed randomness of molecular chaos. There are many versions of the second law, but they all have the same effect, which is to explain the phenomenon of irreversibility in nature.

Third law of thermodynamics: As a system approaches absolute zero the entropy of the system approaches a minimum value.

The third law of thermodynamics is a statistical law of nature regarding entropy and the impossibility of reaching absolute zero of temperature. This law provides an absolute reference point for the determination of entropy. The entropy determined relative to this point is the absolute entropy. Alternate definitions are, "the entropy of all systems and of all states of a system is smallest at absolute zero," or equivalently "it is impossible to reach the absolute zero of temperature by any finite number of processes".
Absolute zero is −273.15 °C (degrees Celsius), or −459.67 °F (degrees Fahrenheit) or 0 K (kelvin).

System models

Types of transfers permitted
type of partition	type of transfer
	Mass and energy	Work	Heat
permeable to matter	+	0	0
permeable to energy but impermeable to matter	0	+	+
adiabatic	0	+	0
adynamic and impermeable to matter	0	0	+
isolating	0	0	0

A diagram of a generic thermodynamic system

An important concept in thermodynamics is the thermodynamic system, a precisely defined region of the universe under study. Everything in the universe except the system is known as the surroundings. A system is separated from the remainder of the universe by a boundary, which may be actual, or merely notional and fictive, but by convention delimits a finite volume. Transfers of work, heat, or matter between the system and the surroundings take place across this boundary. The boundary may or may not have properties that restrict what can be transferred across it. A system may have several distinct boundary sectors or partitions separating it from the surroundings, each characterized by how it restricts transfers, and being permeable to its characteristic transferred quantities.
The volume can be the region surrounding a single atom resonating energy, as Max Planck defined in 1900;^{[citation needed]} it can be a body of steam or air in a steam engine, such as Sadi Carnot defined in 1824; it can be the body of a tropical cyclone, such as Kerry Emanuel theorized in 1986 in the field of atmospheric thermodynamics; it could also be just one nuclide (i.e. a system of quarks) as hypothesized in quantum thermodynamics.
Anything that passes across the boundary needs to be accounted for in a proper transfer balance equation. Thermodynamics is largely about such transfers.
Boundary sectors are of various characters: rigid, flexible, fixed, moveable, actually restrictive, and fictive or not actually restrictive. For example, in an engine, a fixed boundary sector means the piston is locked at its position; then no pressure-volume work is done across it. In that same engine, a moveable boundary allows the piston to move in and out, permitting pressure-volume work. There is no restrictive boundary sector for the whole earth including its atmosphere, and so roughly speaking, no pressure-volume work is done on or by the whole earth system. Such a system is sometimes said to be diabatically heated or cooled by radiation.^[86]^[87]
Thermodynamics distinguishes classes of systems by their boundary sectors.

An open system has a boundary sector that is permeable to matter; such a sector is usually permeable also to energy, but the energy that passes cannot in general be uniquely sorted into heat and work components. Open system boundaries may be either actually restrictive, or else non-restrictive.
A closed system has no boundary sector that is permeable to matter, but in general its boundary is permeable to energy. For closed systems, boundaries are totally prohibitive of matter transfer.
An adiabatically isolated system has only adiabatic boundary sectors. Energy can be transferred as work, but transfers of matter and of energy as heat are prohibited.
A purely diathermically isolated system has only boundary sectors permeable only to heat; it is sometimes said to be adynamically isolated and closed to matter transfer. A process in which no work is transferred is sometimes called adynamic.^[88]
An isolated system has only isolating boundary sectors. Nothing can be transferred into or out of it.

Engineering and natural processes are often described as composites of many different component simple systems, sometimes with unchanging or changing partitions between them. A change of partition is an example of a thermodynamic operation.

States and processes

There are three fundamental kinds of entity in thermodynamics, states of a system, processes of a system, and thermodynamic operations. This allows three fundamental approaches to thermodynamic reasoning, that in terms of states of thermodynamic equilibrium of a system, and that in terms of time-invariant processes of a system, and that in terms of cyclic processes of a system.
The approach through states of thermodynamic equilibrium of a system requires a full account of the state of the system as well as a notion of process from one state to another of a system, but may require only an idealized or partial account of the state of the surroundings of the system or of other systems.
The method of description in terms of states of thermodynamic equilibrium has limitations. For example, processes in a region of turbulent flow, or in a burning gas mixture, or in a Knudsen gas may be beyond "the province of thermodynamics".^[89]^[90]^[91] This problem can sometimes be circumvented through the method of description in terms of cyclic or of time-invariant flow processes. This is part of the reason why the founders of thermodynamics often preferred the cyclic process description.
Approaches through processes of time-invariant flow of a system are used for some studies. Some processes, for example Joule-Thomson expansion, are studied through steady-flow experiments, but can be accounted for by distinguishing the steady bulk flow kinetic energy from the internal energy, and thus can be regarded as within the scope of classical thermodynamics defined in terms of equilibrium states or of cyclic processes.^[36]^[92] Other flow processes, for example thermoelectric effects, are essentially defined by the presence of differential flows or diffusion so that they cannot be adequately accounted for in terms of equilibrium states or classical cyclic processes.^[93]^[94]
The notion of a cyclic process does not require a full account of the state of the system, but does require a full account of how the process occasions transfers of matter and energy between the principal system (which is often called the working body) and its surroundings, which must include at least two heat reservoirs at different known and fixed temperatures, one hotter than the principal system and the other colder than it, as well as a reservoir that can receive energy from the system as work and can do work on the system. The reservoirs can alternatively be regarded as auxiliary idealized component systems, alongside the principal system. Thus an account in terms of cyclic processes requires at least four contributory component systems. The independent variables of this account are the amounts of energy that enter and leave the idealized auxiliary systems. In this kind of account, the working body is often regarded as a "black box",^[95] and its own state is not specified. In this approach, the notion of a properly numerical scale of empirical temperature is a presupposition of thermodynamics, not a notion constructed by or derived from it.

Account in terms of states of thermodynamic equilibrium

When a system is at thermodynamic equilibrium under a given set of conditions of its surroundings, it is said to be in a definite thermodynamic state, which is fully described by its state variables.
If a system is simple as defined above, and is in thermodynamic equilibrium, and is not subject to an externally imposed force field, such as gravity, electricity, or magnetism, then it is homogeneous, that is say, spatially uniform in all respects.^[96]
In a sense, a homogeneous system can be regarded as spatially zero-dimensional, because it has no spatial variation.
If a system in thermodynamic equilibrium is homogeneous, then its state can be described by a few physical variables, which are mostly classifiable as intensive variables and extensive variables.^[8]^[27]^[61]^[97]^[98]
An intensive variable is one that is unchanged with the thermodynamic operation of scaling of a system.
An extensive variable is one that simply scales with the scaling of a system, without the further requirement used just below here, of additivity even when there is inhomogeneity of the added systems.
Examples of extensive thermodynamic variables are total mass and total volume. Under the above definition, entropy is also regarded as an extensive variable. Examples of intensive thermodynamic variables are temperature, pressure, and chemical concentration; intensive thermodynamic variables are defined at each spatial point and each instant of time in a system. Physical macroscopic variables can be mechanical, material, or thermal.^[27] Temperature is a thermal variable; according to Guggenheim, "the most important conception in thermodynamics is temperature."^[8]
Intensive variables have the property that if any number of systems, each in its own separate homogeneous thermodynamic equilibrium state, all with the same respective values of all of their intensive variables, regardless of the values of their extensive variables, are laid contiguously with no partition between them, so as to form a new system, then the values of the intensive variables of the new system are the same as those of the separate constituent systems. Such a composite system is in a homogeneous thermodynamic equilibrium. Examples of intensive variables are temperature, chemical concentration, pressure, density of mass, density of internal energy, and, when it can be properly defined, density of entropy.^[99] In other words, intensive variables are not altered by the thermodynamic operation of scaling.
For the immediately present account just below, an alternative definition of extensive variables is considered, that requires that if any number of systems, regardless of their possible separate thermodynamic equilibrium or non-equilibrium states or intensive variables, are laid side by side with no partition between them so as to form a new system, then the values of the extensive variables of the new system are the sums of the values of the respective extensive variables of the individual separate constituent systems. Obviously, there is no reason to expect such a composite system to be in a homogeneous thermodynamic equilibrium. Examples of extensive variables in this alternative definition are mass, volume, and internal energy. They depend on the total quantity of mass in the system.^[100] In other words, although extensive variables scale with the system under the thermodynamic operation of scaling, nevertheless the present alternative definition of an extensive variable requires more than this: it requires also its additivity regardless of the inhomogeneity (or equality or inequality of the values of the intensive variables) of the component systems.
Though, when it can be properly defined, density of entropy is an intensive variable, for inhomogeneous systems, entropy itself does not fit into this alternative classification of state variables.^[101]^[102] The reason is that entropy is a property of a system as a whole, and not necessarily related simply to its constituents separately. It is true that for any number of systems each in its own separate homogeneous thermodynamic equilibrium, all with the same values of intensive variables, removal of the partitions between the separate systems results in a composite homogeneous system in thermodynamic equilibrium, with all the values of its intensive variables the same as those of the constituent systems, and it is reservedly or conditionally true that the entropy of such a restrictively defined composite system is the sum of the entropies of the constituent systems. But if the constituent systems do not satisfy these restrictive conditions, the entropy of a composite system cannot be expected to be the sum of the entropies of the constituent systems, because the entropy is a property of the composite system as a whole. Therefore, though under these restrictive reservations, entropy satisfies some requirements for extensivity defined just above, entropy in general does not fit the immediately present definition of an extensive variable.
Being neither an intensive variable nor an extensive variable according to the immediately present definition, entropy is thus a stand-out variable, because it is a state variable of a system as a whole.^[101] A non-equilibrium system can have a very inhomogeneous dynamical structure. This is one reason for distinguishing the study of equilibrium thermodynamics from the study of non-equilibrium thermodynamics.
The physical reason for the existence of extensive variables is the time-invariance of volume in a given inertial reference frame, and the strictly local conservation of mass, momentum, angular momentum, and energy. As noted by Gibbs, entropy is unlike energy and mass, because it is not locally conserved.^[101] The stand-out quantity entropy is never conserved in real physical processes; all real physical processes are irreversible.^[103] The motion of planets seems reversible on a short time scale (millions of years), but their motion, according to Newton's laws, is mathematically an example of deterministic chaos. Eventually a planet suffers an unpredictable collision with an object from its surroundings, outer space in this case, and consequently its future course is radically unpredictable. Theoretically this can be expressed by saying that every natural process dissipates some information from the predictable part of its activity into the unpredictable part. The predictable part is expressed in the generalized mechanical variables, and the unpredictable part in heat.
Other state variables can be regarded as conditionally 'extensive' subject to reservation as above, but not extensive as defined above. Examples are the Gibbs free energy, the Helmholtz free energy, and the enthalpy. Consequently, just because for some systems under particular conditions of their surroundings such state variables are conditionally conjugate to intensive variables, such conjugacy does not make such state variables extensive as defined above. This is another reason for distinguishing the study of equilibrium thermodynamics from the study of non-equilibrium thermodynamics. In another way of thinking, this explains why heat is to be regarded as a quantity that refers to a process and not to a state of a system.
A system with no internal partitions, and in thermodynamic equilibrium, can be inhomogeneous in the following respect: it can consist of several so-called 'phases', each homogeneous in itself, in immediate contiguity with other phases of the system, but distinguishable by their having various respectively different physical characters, with discontinuity of intensive variables at the boundaries between the phases; a mixture of different chemical species is considered homogeneous for this purpose if it is physically homogeneous.^[104] For example, a vessel can contain a system consisting of water vapour overlying liquid water; then there is a vapour phase and a liquid phase, each homogeneous in itself, but still in thermodynamic equilibrium with the other phase. For the immediately present account, systems with multiple phases are not considered, though for many thermodynamic questions, multiphase systems are important.

Equation of state

The macroscopic variables of a thermodynamic system in thermodynamic equilibrium, in which temperature is well defined, can be related to one another through equations of state or characteristic equations.^[23]^[24]^[25]^[26] They express the constitutive peculiarities of the material of the system. The equation of state must comply with some thermodynamic constraints, but cannot be derived from the general principles of thermodynamics alone.

Thermodynamic processes between states of thermodynamic equilibrium

A thermodynamic process is defined by changes of state internal to the system of interest, combined with transfers of matter and energy to and from the surroundings of the system or to and from other systems. A system is demarcated from its surroundings or from other systems by partitions that more or less separate them, and may move as a piston to change the volume of the system and thus transfer work.

Dependent and independent variables for a process

A process is described by changes in values of state variables of systems or by quantities of exchange of matter and energy between systems and surroundings. The change must be specified in terms of prescribed variables. The choice of which variables are to be used is made in advance of consideration of the course of the process, and cannot be changed. Certain of the variables chosen in advance are called the independent variables.^[105] From changes in independent variables may be derived changes in other variables called dependent variables. For example a process may occur at constant pressure with pressure prescribed as an independent variable, and temperature changed as another independent variable, and then changes in volume are considered as dependent. Careful attention to this principle is necessary in thermodynamics.^[106]^[107]

Changes of state of a system

In the approach through equilibrium states of the system, a process can be described in two main ways.
In one way, the system is considered to be connected to the surroundings by some kind of more or less separating partition, and allowed to reach equilibrium with the surroundings with that partition in place. Then, while the separative character of the partition is kept unchanged, the conditions of the surroundings are changed, and exert their influence on the system again through the separating partition, or the partition is moved so as to change the volume of the system; and a new equilibrium is reached. For example, a system is allowed to reach equilibrium with a heat bath at one temperature; then the temperature of the heat bath is changed and the system is allowed to reach a new equilibrium; if the partition allows conduction of heat, the new equilibrium is different from the old equilibrium.
In the other way, several systems are connected to one another by various kinds of more or less separating partitions, and to reach equilibrium with each other, with those partitions in place. In this way, one may speak of a 'compound system'. Then one or more partitions is removed or changed in its separative properties or moved, and a new equilibrium is reached. The Joule-Thomson experiment is an example of this; a tube of gas is separated from another tube by a porous partition; the volume available in each of the tubes is determined by respective pistons; equilibrium is established with an initial set of volumes; the volumes are changed and a new equilibrium is established.^[108]^[109]^[110]^[111]^[112] Another example is in separation and mixing of gases, with use of chemically semi-permeable membranes.^[113]

Commonly considered thermodynamic processes

It is often convenient to study a thermodynamic process in which a single variable, such as temperature, pressure, or volume, etc., is held fixed. Furthermore, it is useful to group these processes into pairs, in which each variable held constant is one member of a conjugate pair.
Several commonly studied thermodynamic processes are:

Isobaric process: occurs at constant pressure
Isochoric process: occurs at constant volume (also called isometric/isovolumetric)
Isothermal process: occurs at a constant temperature
Adiabatic process: occurs without loss or gain of energy as heat
Isentropic process: a reversible adiabatic process occurs at a constant entropy, but is a fictional idealization. Conceptually it is possible to actually physically conduct a process that keeps the entropy of the system constant, allowing systematically controlled removal of heat, by conduction to a cooler body, to compensate for entropy produced within the system by irreversible work done on the system. Such isentropic conduct of a process seems called for when the entropy of the system is considered as an independent variable, as for example when the internal energy is considered as a function of the entropy and volume of the system, the natural variables of the internal energy as studied by Gibbs.
Isenthalpic process: occurs at a constant enthalpy
Isolated process: no matter or energy (neither as work nor as heat) is transferred into or out of the system

It is sometimes of interest to study a process in which several variables are controlled, subject to some specified constraint. In a system in which a chemical reaction can occur, for example, in which the pressure and temperature can affect the equilibrium composition, a process might occur in which temperature is held constant but pressure is slowly altered, just so that chemical equilibrium is maintained all the way. There is a corresponding process at constant temperature in which the final pressure is the same but is reached by a rapid jump. Then it can be shown that the volume change resulting from the rapid jump process is smaller than that from the slow equilibrium process.^[114] The work transferred differs between the two processes.

Account in terms of cyclic processes

A cyclic process^[20] is a process that can be repeated indefinitely often without changing the final state of the system in which the process occurs. The only traces of the effects of a cyclic process are to be found in the surroundings of the system or in other systems. This is the kind of process that concerned early thermodynamicists such as Carnot, and in terms of which Kelvin defined absolute temperature,^[115]^[116] before the use of the quantity of entropy by Rankine^[117] and its clear identification by Clausius.^[118] For some systems, for example with some plastic working substances, cyclic processes are practically nearly unfeasible because the working substance undergoes practically irreversible changes.^[60] This is why mechanical devices are lubricated with oil and one of the reasons why electrical devices are often useful.
A cyclic process of a system requires in its surroundings at least two heat reservoirs at different temperatures, one at a higher temperature that supplies heat to the system, the other at a lower temperature that accepts heat from the system. The early work on thermodynamics tended to use the cyclic process approach, because it was interested in machines that converted some of the heat from the surroundings into mechanical power delivered to the surroundings, without too much concern about the internal workings of the machine. Such a machine, while receiving an amount of heat from a higher temperature reservoir, always needs a lower temperature reservoir that accepts some lesser amount of heat. The difference in amounts of heat is equal to the amount of heat converted to work.^[83]^[119] Later, the internal workings of a system became of interest, and they are described by the states of the system. Nowadays, instead of arguing in terms of cyclic processes, some writers are inclined to derive the concept of absolute temperature from the concept of entropy, a variable of state.

Instrumentation

There are two types of thermodynamic instruments, the meter and the reservoir. A thermodynamic meter is any device that measures any parameter of a thermodynamic system. In some cases, the thermodynamic parameter is actually defined in terms of an idealized measuring instrument. For example, the zeroth law states that if two bodies are in thermal equilibrium with a third body, they are also in thermal equilibrium with each other. This principle, as noted by James Maxwell in 1872, asserts that it is possible to measure temperature. An idealized thermometer is a sample of an ideal gas at constant pressure. From the ideal gas law PV=nRT, the volume of such a sample can be used as an indicator of temperature; in this manner it defines temperature. Although pressure is defined mechanically, a pressure-measuring device, called a barometer may also be constructed from a sample of an ideal gas held at a constant temperature. A calorimeter is a device that measures and define the internal energy of a system.
A thermodynamic reservoir is a system so large that it does not appreciably alter its state parameters when brought into contact with the test system. It is used to impose a particular value of a state parameter upon the system. For example, a pressure reservoir is a system at a particular pressure, which imposes that pressure upon any test system that it is mechanically connected to. The Earth's atmosphere is often used as a pressure reservoir.

Conjugate variables

Main article: Conjugate variables

A central concept of thermodynamics is that of energy. By the First Law, the total energy of a system and its surroundings is conserved. Energy may be transferred into a system by heating, compression, or addition of matter, and extracted from a system by cooling, expansion, or extraction of matter. In mechanics, for example, energy transfer equals the product of the force applied to a body and the resulting displacement.
Conjugate variables are pairs of thermodynamic concepts, with the first being akin to a "force" applied to some thermodynamic system, the second being akin to the resulting "displacement," and the product of the two equalling the amount of energy transferred. The common conjugate variables are:

Pressure-volume (the mechanical parameters);
Temperature-entropy (thermal parameters);
Chemical potential-particle number (material parameters).

Potentials

Thermodynamic potentials are different quantitative measures of the stored energy in a system. Potentials are used to measure energy changes in systems as they evolve from an initial state to a final state. The potential used depends on the constraints of the system, such as constant temperature or pressure. For example, the Helmholtz and Gibbs energies are the energies available in a system to do useful work when the temperature and volume or the pressure and temperature are fixed, respectively.
The five most well known potentials are:

Name	Symbol	Formula	Natural variables
Internal energy	$U$	$\int ( T dS - p dV + \sum_i \mu_i dN_i )$	$S, V, \{N_i\}$
Helmholtz free energy	$F$	$U-TS$	$T, V, \{N_i\}$
Enthalpy	$H$	$U+pV$	$S, p, \{N_i\}$
Gibbs free energy	$G$	$U+pV-TS$	$T, p, \{N_i\}$
Landau Potential (Grand potential)	$\Omega$ , $\Phi_{G}$	$U - T S -$ $\sum_i\,$ $\mu_i N_i$	$T, V, \{\mu_i\}$

where $T$ is the temperature, $S$ the entropy, $p$ the pressure, $V$ the volume, $\mu$ the chemical potential, $N$ the number of particles in the system, and $i$ is the count of particles types in the system.
Thermodynamic potentials can be derived from the energy balance equation applied to a thermodynamic system. Other thermodynamic potentials can also be obtained through Legendre transformation.

Axiomatics

Most accounts of thermodynamics presuppose the law of conservation of mass, sometimes with,^[120] and sometimes without,^[121]^[122]^[123] explicit mention. Particular attention is paid to the law in accounts of non-equilibrium thermodynamics.^[124]^[125] One statement of this law is "The total mass of a closed system remains constant."^[9] Another statement of it is "In a chemical reaction, matter is neither created nor destroyed."^[126] Implied in this is that matter and energy are not considered to be interconverted in such accounts. The full generality of the law of conservation of energy is thus not used in such accounts.
In 1909, Constantin Carathéodory presented^[48] a purely mathematical axiomatic formulation, a description often referred to as geometrical thermodynamics, and sometimes said to take the "mechanical approach"^[78] to thermodynamics. The Carathéodory formulation is restricted to equilibrium thermodynamics and does not attempt to deal with non-equilibrium thermodynamics, forces that act at a distance on the system, or surface tension effects.^[127] Moreover, Carathéodory's formulation does not deal with materials like water near 4 °C, which have a density extremum as a function of temperature at constant pressure.^[128]^[129] Carathéodory used the law of conservation of energy as an axiom from which, along with the contents of the zeroth law, and some other assumptions including his own version of the second law, he derived the first law of thermodynamics.^[130] Consequently one might also describe Carathėodory's work as lying in the field of energetics,^[131] which is broader than thermodynamics. Carathéodory presupposed the law of conservation of mass without explicit mention of it.
Since the time of Carathėodory, other influential axiomatic formulations of thermodynamics have appeared, which like Carathéodory's, use their own respective axioms, different from the usual statements of the four laws, to derive the four usually stated laws.^[132]^[133]^[134]
Many axiomatic developments assume the existence of states of thermodynamic equilibrium and of states of thermal equilibrium. States of thermodynamic equilibrium of compound systems allow their component simple systems to exchange heat and matter and to do work on each other on their way to overall joint equilibrium. Thermal equilibrium allows them only to exchange heat. The physical properties of glass depend on its history of being heated and cooled and, strictly speaking, glass is not in thermodynamic equilibrium.^[63]
According to Herbert Callen's widely cited 1985 text on thermodynamics: "An essential prerequisite for the measurability of energy is the existence of walls that do not permit transfer of energy in the form of heat.".^[135] According to Werner Heisenberg's mature and careful examination of the basic concepts of physics, the theory of heat has a self-standing place.^[136]
From the viewpoint of the axiomatist, there are several different ways of thinking about heat, temperature, and the second law of thermodynamics. The Clausius way rests on the empirical fact that heat is conducted always down, never up, a temperature gradient. The Kelvin way is to assert the empirical fact that conversion of heat into work by cyclic processes is never perfectly efficient. A more mathematical way is to assert the existence of a function of state called the entropy that tells whether a hypothesized process occurs spontaneously in nature. A more abstract way is that of Carathéodory that in effect asserts the irreversibility of some adiabatic processes. For these different ways, there are respective corresponding different ways of viewing heat and temperature.
The Clausius–Kelvin–Planck way This way prefers ideas close to the empirical origins of thermodynamics. It presupposes transfer of energy as heat, and empirical temperature as a scalar function of state. According to Gislason and Craig (2005): "Most thermodynamic data come from calorimetry..."^[137] According to Kondepudi (2008): "Calorimetry is widely used in present day laboratories."^[138] In this approach, what is often currently called the zeroth law of thermodynamics is deduced as a simple consequence of the presupposition of the nature of heat and empirical temperature, but it is not named as a numbered law of thermodynamics. Planck attributed this point of view to Clausius, Kelvin, and Maxwell. Planck wrote (on page 90 of the seventh edition, dated 1922, of his treatise) that he thought that no proof of the second law of thermodynamics could ever work that was not based on the impossibility of a perpetual motion machine of the second kind. In that treatise, Planck makes no mention of the 1909 Carathéodory way, which was well known by 1922. Planck for himself chose a version of what is just above called the Kelvin way.^[139] The development by Truesdell and Bharatha (1977) is so constructed that it can deal naturally with cases like that of water near 4 °C.^[133]
The way that assumes the existence of entropy as a function of state This way also presupposes transfer of energy as heat, and it presupposes the usually stated form of the zeroth law of thermodynamics, and from these two it deduces the existence of empirical temperature. Then from the existence of entropy it deduces the existence of absolute thermodynamic temperature.^[8]^[132]
The Carathéodory way This way presupposes that the state of a simple one-phase system is fully specifiable by just one more state variable than the known exhaustive list of mechanical variables of state. It does not explicitly name empirical temperature, but speaks of the one-dimensional "non-deformation coordinate". This satisfies the definition of an empirical temperature, that lies on a one-dimensional manifold. The Carathéodory way needs to assume moreover that the one-dimensional manifold has a definite sense, which determines the direction of irreversible adiabatic process, which is effectively assuming that heat is conducted from hot to cold. This way presupposes the often currently stated version of the zeroth law, but does not actually name it as one of its axioms.^[127] According to one author, Carathéodory's principle, which is his version of the second law of thermodynamics, does not imply the increase of entropy when work is done under adiabatic conditions (as was noted by Planck^[140]). Thus Carathéodory's way leaves unstated a further empirical fact that is needed for a full expression of the second law of thermodynamics.^[141]

Scope of thermodynamics

Originally thermodynamics concerned material and radiative phenomena that are experimentally reproducible. For example, a state of thermodynamic equilibrium is a steady state reached after a system has aged so that it no longer changes with the passage of time. But more than that, for thermodynamics, a system, defined by its being prepared in a certain way must, consequent on every particular occasion of preparation, upon aging, reach one and the same eventual state of thermodynamic equilibrium, entirely determined by the way of preparation. Such reproducibility is because the systems consist of so many molecules that the molecular variations between particular occasions of preparation have negligible or scarcely discernable effects on the macroscopic variables that are used in thermodynamic descriptions. This led to Boltzmann's discovery that entropy had a statistical or probabilistic nature. Probabilistic and statistical explanations arise from the experimental reproducibility of the phenomena.^[142]
Gradually, the laws of thermodynamics came to be used to explain phenomena that occur outside the experimental laboratory. For example, phenomena on the scale of the earth's atmosphere cannot be reproduced in a laboratory experiment. But processes in the atmosphere can be modeled by use of thermodynamic ideas, extended well beyond the scope of laboratory equilibrium thermodynamics.^[143]^[144]^[145] A parcel of air can, near enough for many studies, be considered as a closed thermodynamic system, one that is allowed to move over significant distances. The pressure exerted by the surrounding air on the lower face of a parcel of air may differ from that on its upper face. If this results in rising of the parcel of air, it can be considered to have gained potential energy as a result of work being done on it by the combined surrounding air below and above it. As it rises, such a parcel usually expands because the pressure is lower at the higher altitudes that it reaches. In that way, the rising parcel also does work on the surrounding atmosphere. For many studies, such a parcel can be considered nearly to neither gain nor lose energy by heat conduction to its surrounding atmosphere, and its rise is rapid enough to leave negligible time for it to gain or lose heat by radiation; consequently the rising of the parcel is near enough adiabatic. Thus the adiabatic gas law accounts for its internal state variables, provided that there is no precipitation into water droplets, no evaporation of water droplets, and no sublimation in the process. More precisely, the rising of the parcel is likely to occasion friction and turbulence, so that some potential and some kinetic energy of bulk converts into internal energy of air considered as effectively stationary. Friction and turbulence thus oppose the rising of the parcel.^[146]^[147]

Applied fields

Electromagnetism

From Wikipedia, the free encyclopedia

This article includes a list of references, but its sources remain unclear because it has insufficient inline citations. Please help to improve this article by introducing more precise citations. (November 2012)

Electromagnetism

Electricity Magnetism
Electrostatics [show]
Magnetostatics [show]
Electrodynamics [hide] Lorentz force law Electromagnetic induction Faraday's law Lenz's law Displacement current Maxwell's equations Electromagnetic field Electromagnetic radiation Maxwell tensor Poynting vector Liénard–Wiechert potential Jefimenko's equations Eddy current
Electrical network [show]
Covariant formulation [show]
Scientists[show]
v t e

Electromagnetism, or the electromagnetic force is one of the four fundamental interactions in nature, the other three being the strong interaction, the weak interaction, and gravitation. This force is described by electromagnetic fields, and has innumerable physical instances including the interaction of electrically charged particles and the interaction of uncharged magnetic force fields with electrical conductors.
The word electromagnetism is a compound form of two Greek terms, ἢλεκτρον, ēlektron, "amber", and μαγνήτης, magnētēs, "magnet". The science of electromagnetic phenomena is defined in terms of the electromagnetic force, sometimes called the Lorentz force, which includes both electricity and magnetism as elements of one phenomenon.
During the quark epoch, the electroweak force split into the electromagnetic and weak force. The electromagnetic force plays a major role in determining the internal properties of most objects encountered in daily life. Ordinary matter takes its form as a result of intermolecular forces between individual molecules in matter. Electrons are bound by electromagnetic wave mechanics into orbitals around atomic nuclei to form atoms, which are the building blocks of molecules. This governs the processes involved in chemistry, which arise from interactions between the electrons of neighboring atoms, which are in turn determined by the interaction between electromagnetic force and the momentum of the electrons.
There are numerous mathematical descriptions of the electromagnetic field. In classical electrodynamics, electric fields are described as electric potential and electric current in Ohm's law, magnetic fields are associated with electromagnetic induction and magnetism, and Maxwell's equations describe how electric and magnetic fields are generated and altered by each other and by charges and currents.
The theoretical implications of electromagnetism, in particular the establishment of the speed of light based on properties of the "medium" of propagation (permeability and permittivity), led to the development of special relativity by Albert Einstein in 1905.

History of the theory

Hans Christian Ørsted

Originally electricity and magnetism were thought of as two separate forces. This view changed, however, with the publication of James Clerk Maxwell's 1873 Treatise on Electricity and Magnetism in which the interactions of positive and negative charges were shown to be regulated by one force. There are four main effects resulting from these interactions, all of which have been clearly demonstrated by experiments:

Electric charges attract or repel one another with a force inversely proportional to the square of the distance between them: unlike charges attract, like ones repel.
Magnetic poles (or states of polarization at individual points) attract or repel one another in a similar way and always come in pairs: every north pole is yoked to a south pole.
An electric current in a wire creates a circular magnetic field around the wire, its direction (clockwise or counter-clockwise) depending on that of the current.
A current is induced in a loop of wire when it is moved towards or away from a magnetic field, or a magnet is moved towards or away from it, the direction of current depending on that of the movement.

André-Marie Ampère

While preparing for an evening lecture on 21 April 1820, Hans Christian Ørsted made a surprising observation. As he was setting up his materials, he noticed a compass needle deflected from magnetic north when the electric current from the battery he was using was switched on and off. This deflection convinced him that magnetic fields radiate from all sides of a wire carrying an electric current, just as light and heat do, and that it confirmed a direct relationship between electricity and magnetism.

Michael Faraday

At the time of discovery, Ørsted did not suggest any satisfactory explanation of the phenomenon, nor did he try to represent the phenomenon in a mathematical framework. However, three months later he began more intensive investigations. Soon thereafter he published his findings, proving that an electric current produces a magnetic field as it flows through a wire. The CGS unit of magnetic induction (oersted) is named in honor of his contributions to the field of electromagnetism.

James Clerk Maxwell

His findings resulted in intensive research throughout the scientific community in electrodynamics. They influenced French physicist André-Marie Ampère's developments of a single mathematical form to represent the magnetic forces between current-carrying conductors. Ørsted's discovery also represented a major step toward a unified concept of energy.
This unification, which was observed by Michael Faraday, extended by James Clerk Maxwell, and partially reformulated by Oliver Heaviside and Heinrich Hertz, is one of the key accomplishments of 19th century mathematical physics. It had far-reaching consequences, one of which was the understanding of the nature of light. Unlike what was proposed in Electromagnetism, light and other electromagnetic waves are at the present seen as taking the form of quantized, self-propagating oscillatory electromagnetic field disturbances which have been called photons. Different frequencies of oscillation give rise to the different forms of electromagnetic radiation, from radio waves at the lowest frequencies, to visible light at intermediate frequencies, to gamma rays at the highest frequencies.
Ørsted was not the only person to examine the relation between electricity and magnetism. In 1802 Gian Domenico Romagnosi, an Italian legal scholar, deflected a magnetic needle by electrostatic charges. Actually, no galvanic current existed in the setup and hence no electromagnetism was present. An account of the discovery was published in 1802 in an Italian newspaper, but it was largely overlooked by the contemporary scientific community.^[1]

Overview

The electromagnetic force is one of the four known fundamental forces. The other fundamental forces are:

the weak nuclear force, which binds to all known particles in the Standard Model, and causes certain forms of radioactive decay. (In particle physics though, the electroweak interaction is the unified description of two of the four known fundamental interactions of nature: electromagnetism and the weak interaction);
the strong nuclear force, which binds quarks to form nucleons, and binds nucleons to form nuclei
the gravitational force.

All other forces (e.g., friction) are ultimately derived from these fundamental forces and momentum carried by the movement of particles.
The electromagnetic force is the one responsible for practically all the phenomena one encounters in daily life above the nuclear scale, with the exception of gravity. Roughly speaking, all the forces involved in interactions between atoms can be explained by the electromagnetic force acting on the electrically charged atomic nuclei and electrons inside and around the atoms, together with how these particles carry momentum by their movement. This includes the forces we experience in "pushing" or "pulling" ordinary material objects, which come from the intermolecular forces between the individual molecules in our bodies and those in the objects. It also includes all forms of chemical phenomena.
A necessary part of understanding the intra-atomic to intermolecular forces is the effective force generated by the momentum of the electrons' movement, and that electrons move between interacting atoms, carrying momentum with them. As a collection of electrons becomes more confined, their minimum momentum necessarily increases due to the Pauli exclusion principle. The behaviour of matter at the molecular scale including its density is determined by the balance between the electromagnetic force and the force generated by the exchange of momentum carried by the electrons themselves.

Classical electrodynamics

Main article: Classical electrodynamics

The scientist William Gilbert proposed, in his De Magnete (1600), that electricity and magnetism, while both capable of causing attraction and repulsion of objects, were distinct effects. Mariners had noticed that lightning strikes had the ability to disturb a compass needle, but the link between lightning and electricity was not confirmed until Benjamin Franklin's proposed experiments in 1752. One of the first to discover and publish a link between man-made electric current and magnetism was Romagnosi, who in 1802 noticed that connecting a wire across a voltaic pile deflected a nearby compass needle. However, the effect did not become widely known until 1820, when Ørsted performed a similar experiment.^[2] Ørsted's work influenced Ampère to produce a theory of electromagnetism that set the subject on a mathematical foundation.
A theory of electromagnetism, known as classical electromagnetism, was developed by various physicists over the course of the 19th century, culminating in the work of James Clerk Maxwell, who unified the preceding developments into a single theory and discovered the electromagnetic nature of light. In classical electromagnetism, the electromagnetic field obeys a set of equations known as Maxwell's equations, and the electromagnetic force is given by the Lorentz force law.
One of the peculiarities of classical electromagnetism is that it is difficult to reconcile with classical mechanics, but it is compatible with special relativity. According to Maxwell's equations, the speed of light in a vacuum is a universal constant, dependent only on the electrical permittivity and magnetic permeability of free space. This violates Galilean invariance, a long-standing cornerstone of classical mechanics. One way to reconcile the two theories (electromagnetism and classical mechanics) is to assume the existence of a luminiferous aether through which the light propagates. However, subsequent experimental efforts failed to detect the presence of the aether. After important contributions of Hendrik Lorentz and Henri Poincaré, in 1905, Albert Einstein solved the problem with the introduction of special relativity, which replaces classical kinematics with a new theory of kinematics that is compatible with classical electromagnetism. (For more information, see History of special relativity.)
In addition, relativity theory shows that in moving frames of reference a magnetic field transforms to a field with a nonzero electric component and vice versa; thus firmly showing that they are two sides of the same coin, and thus the term "electromagnetism". (For more information, see Classical electromagnetism and special relativity and Covariant formulation of classical electromagnetism.

Photoelectric effect

Main article: Photoelectric effect

In another paper published in that same year, Albert Einstein undermined the very foundations of classical electromagnetism. In his theory of the photoelectric effect (for which he won the Nobel prize for physics) and inspired by the idea of Max Planck's "quanta", he posited that light could exist in discrete particle-like quantities as well, which later came to be known as photons. Einstein's theory of the photoelectric effect extended the insights that appeared in the solution of the ultraviolet catastrophe presented by Max Planck in 1900. In his work, Planck showed that hot objects emit electromagnetic radiation in discrete packets ("quanta"), which leads to a finite total energy emitted as black body radiation. Both of these results were in direct contradiction with the classical view of light as a continuous wave. Planck's and Einstein's theories were progenitors of quantum mechanics, which, when formulated in 1925, necessitated the invention of a quantum theory of electromagnetism. This theory, completed in the 1940s-1950s, is known as quantum electrodynamics (or "QED"), and, in situations where perturbation theory is applicable, is one of the most accurate theories known to physics.

Quantities and units

Electromagnetic units are part of a system of electrical units based primarily upon the magnetic properties of electric currents, the fundamental SI unit being the ampere. The units are:

ampere (electric current)
coulomb (electric charge)
farad (capacitance)
henry (inductance)
ohm (resistance)
tesla (magnetic flux density)
volt (electric potential)
watt (power)
weber (magnetic flux)

In the electromagnetic cgs system, electric current is a fundamental quantity defined via Ampère's law and takes the permeability as a dimensionless quantity (relative permeability) whose value in a vacuum is unity. As a consequence, the square of the speed of light appears explicitly in some of the equations interrelating quantities in this system.

SI electromagnetism units v t e
Symbol^[3]	Name of Quantity	Derived Units	Unit	Base Units
I	electric current	ampere (SI base unit)	A	A (= W/V = C/s)
Q	electric charge	coulomb	C	A⋅s
U, ΔV, Δφ; E	potential difference; electromotive force	volt	V	kg⋅m²⋅s⁻³⋅A⁻¹ (= J/C)
R; Z; X	electric resistance; impedance; reactance	ohm	Ω	kg⋅m²⋅s⁻³⋅A⁻² (= V/A)
ρ	resistivity	ohm metre	Ω⋅m	kg⋅m³⋅s⁻³⋅A⁻²
P	electric power	watt	W	kg⋅m²⋅s⁻³ (= V⋅A)
C	capacitance	farad	F	kg⁻¹⋅m⁻²⋅s⁴⋅A² (= C/V)
E	electric field strength	volt per metre	V/m	kg⋅m⋅s⁻³⋅A⁻¹ (= N/C)
D	electric displacement field	coulomb per square metre	C/m²	A⋅s⋅m⁻²
ε	permittivity	farad per metre	F/m	kg⁻¹⋅m⁻³⋅s⁴⋅A²
χ_e	electric susceptibility	(dimensionless)	–	–
G; Y; B	conductance; admittance; susceptance	siemens	S	kg⁻¹⋅m⁻²⋅s³⋅A² (= Ω⁻¹)
κ, γ, σ	conductivity	siemens per metre	S/m	kg⁻¹⋅m⁻³⋅s³⋅A²
B	magnetic flux density, magnetic induction	tesla	T	kg⋅s⁻²⋅A⁻¹ (= Wb/m² = N⋅A⁻¹⋅m⁻¹)
Φ	magnetic flux	weber	Wb	kg⋅m²⋅s⁻²⋅A⁻¹ (= V⋅s)
H	magnetic field strength	ampere per metre	A/m	A⋅m⁻¹
L, M	inductance	henry	H	kg⋅m²⋅s⁻²⋅A⁻² (= Wb/A = V⋅s/A)
μ	permeability	henry per metre	H/m	kg⋅m⋅s⁻²⋅A⁻²
χ	magnetic susceptibility	(dimensionless)	–	–

Formulas for physical laws of electromagnetism (such as Maxwell's equations) need to be adjusted depending on what system of units one uses. This is because there is no one-to-one correspondence between electromagnetic units in SI and those in CGS, as is the case for mechanical units. Furthermore, within CGS, there are several plausible choices of electromagnetic units, leading to different unit "sub-systems", including Gaussian, "ESU", "EMU", and Heaviside–Lorentz. Among these choices, Gaussian units are the most common today, and in fact the phrase "CGS units" is often used to refer specifically to CGS-Gaussian units.

Electromagnetic phenomena

With the exception of gravitation, electromagnetic phenomena as described by quantum electrodynamics (which includes classical electrodynamics as a limiting case) account for almost all physical phenomena observable to the unaided human senses, including light and other electromagnetic radiation, all of chemistry, most of mechanics (excepting gravitation), and, of course, magnetism and electricity. Magnetic monopoles (and "Gilbert" dipoles) are not strictly electromagnetic phenomena, since in standard electromagnetism, magnetic fields are generated not by true "magnetic charge" but by currents. There are, however, condensed matter analogs of magnetic monopoles in exotic materials (spin ice) created in the laboratory.^[4]

Electromagnetic induction

Electromagnetic Induction is the Induction of an electromotive force in a circuit by varying the magnetic flux linked with the circuit. The phenomenon was first investigated in 1830-31 by Joseph Henry and Michael Faraday, who discovered that when the magnetic field around an electromagnet was increased and decreased, an electric current should be detected by nearby conductor. A current can also be induced by constantly moving a permanent magnet in and out of a coil of wire, or by constantly moving a conductor near a stationary permanent magnet. The induced electromotive force is proportional to the rate of change of the magnetic flux cutting across the circuit.

WAVES

This article is about waves in the scientific sense. For waves on the surface of the ocean or lakes, see Wind wave. For other uses of wave or waves, see Wave (disambiguation).

Surface waves in water

In physics, a wave is a disturbance or oscillation that travels through space and matter, accompanied by a transfer of energy. Wave motion transfers energy from one point to another, often with no permanent displacement of the particles of the medium—that is, with little or no associated mass transport. They consist, instead, of oscillations or vibrations around almost fixed locations. Waves are described by a wave equation which sets out how the disturbance proceeds over time. The mathematical form of this equation varies depending on the type of wave.
There are two main types of waves. Mechanical waves propagate through a medium, and the substance of this medium is deformed. The deformation reverses itself owing to restoring forces resulting from its deformation. For example, sound waves propagate via air molecules colliding with their neighbors. When air molecules collide, they also bounce away from each other (a restoring force). This keeps the molecules from continuing to travel in the direction of the wave.
The second main type of wave, electromagnetic waves, do not require a medium. Instead, they consist of periodic oscillations of electrical and magnetic fields generated by charged particles, and can therefore travel through a vacuum. These types of waves vary in wavelength, and include radio waves, microwaves, infrared radiation, visible light, ultraviolet radiation, X-rays, and gamma rays.
Further, the behavior of particles in quantum mechanics are described by waves, and researchers believe that gravitational waves also travel through space, although gravitational waves have never been directly detected.
A wave can be transverse or longitudinal depending on the direction of its oscillation. Transverse waves occur when a disturbance creates oscillations perpendicular (at right angles) to the propagation (the direction of energy transfer). Longitudinal waves occur when the oscillations are parallel to the direction of propagation. While mechanical waves can be both transverse and longitudinal, all electromagnetic waves are transverse.

General features

A single, all-encompassing definition for the term wave is not straightforward. A vibration can be defined as a back-and-forth motion around a reference value. However, a vibration is not necessarily a wave. An attempt to define the necessary and sufficient characteristics that qualify a phenomenon to be called a wave results in a fuzzy border line.
The term wave is often intuitively understood as referring to a transport of spatial disturbances that are generally not accompanied by a motion of the medium occupying this space as a whole. In a wave, the energy of a vibration is moving away from the source in the form of a disturbance within the surrounding medium (Hall 1980, p. 8). However, this notion is problematic for a standing wave (for example, a wave on a string), where energy is moving in both directions equally, or for electromagnetic (e.g., light) waves in a vacuum, where the concept of medium does not apply and interaction with a target is the key to wave detection and practical applications. There are water waves on the ocean surface; gamma waves and light waves emitted by the Sun; microwaves used in microwave ovens and in radar equipment; radio waves broadcast by radio stations; and sound waves generated by radio receivers, telephone handsets and living creatures (as voices), to mention only a few wave phenomena.
It may appear that the description of waves is closely related to their physical origin for each specific instance of a wave process. For example, acoustics is distinguished from optics in that sound waves are related to a mechanical rather than an electromagnetic wave transfer caused by vibration. Concepts such as mass, momentum, inertia, or elasticity, become therefore crucial in describing acoustic (as distinct from optic) wave processes. This difference in origin introduces certain wave characteristics particular to the properties of the medium involved. For example, in the case of air: vortices, radiation pressure, shock waves etc.; in the case of solids: Rayleigh waves, dispersion; and so on.
Other properties, however, although usually described in terms of origin, may be generalized to all waves. For such reasons, wave theory represents a particular branch of physics that is concerned with the properties of wave processes independently of their physical origin.^[1] For example, based on the mechanical origin of acoustic waves, a moving disturbance in space–time can exist if and only if the medium involved is neither infinitely stiff nor infinitely pliable. If all the parts making up a medium were rigidly bound, then they would all vibrate as one, with no delay in the transmission of the vibration and therefore no wave motion. On the other hand, if all the parts were independent, then there would not be any transmission of the vibration and again, no wave motion. Although the above statements are meaningless in the case of waves that do not require a medium, they reveal a characteristic that is relevant to all waves regardless of origin: within a wave, the phase of a vibration (that is, its position within the vibration cycle) is different for adjacent points in space because the vibration reaches these points at different times.
Similarly, wave processes revealed from the study of waves other than sound waves can be significant to the understanding of sound phenomena. A relevant example is Thomas Young's principle of interference (Young, 1802, in Hunt 1992, p. 132). This principle was first introduced in Young's study of light and, within some specific contexts (for example, scattering of sound by sound), is still a researched area in the study of sound.

Mathematical description of one-dimensional waves

Wave equation

Main articles: Wave equation and D'Alembert's formula

Consider a traveling transverse wave (which may be a pulse) on a string (the medium). Consider the string to have a single spatial dimension. Consider this wave as traveling

Wavelength λ, can be measured between any two corresponding points on a waveform

in the $x$ direction in space. E.g., let the positive $x$ direction be to the right, and the negative $x$ direction be to the left.
with constant amplitude $u$
with constant velocity , where is
- independent of wavelength (no dispersion)
- independent of amplitude (linear media, not nonlinear).^[2]
with constant waveform, or shape

This wave can then be described by the two-dimensional functions

$u(x,t) = F(x - v \ t)$ (waveform $F$ traveling to the right)

$u(x,t) = G(x + v \ t)$ (waveform $G$ traveling to the left)

or, more generally, by d'Alembert's formula:^[3]

$u(x,t) = F(x-vt) + G(x+vt). \,$

representing two component waveforms $F$ and $G$ traveling through the medium in opposite directions. A generalized representation of this wave can be obtained^[4] as the partial differential equation

$\frac{1}{v^2}\frac{\partial^2 u}{\partial t^2}=\frac{\partial^2 u}{\partial x^2}. \,$

General solutions are based upon Duhamel's principle.^[5]

Wave forms

Sine, square, triangle and sawtooth waveforms.

The form or shape of F in d'Alembert's formula involves the argument x − vt. Constant values of this argument correspond to constant values of F, and these constant values occur if x increases at the same rate that vt increases. That is, the wave shaped like the function F will move in the positive x-direction at velocity v (and G will propagate at the same speed in the negative x-direction).^[6]
In the case of a periodic function F with period λ, that is, F(x + λ − vt) = F(x − vt), the periodicity of F in space means that a snapshot of the wave at a given time t finds the wave varying periodically in space with period λ (the wavelength of the wave). In a similar fashion, this periodicity of F implies a periodicity in time as well: F(x − v(t + T)) = F(x − vt) provided vT = λ, so an observation of the wave at a fixed location x finds the wave undulating periodically in time with period T = λ/v.^[7]

Amplitude and modulation

Illustration of the envelope (the slowly varying red curve) of an amplitude-modulated wave. The fast varying blue curve is the carrier wave, which is being modulated.

Main article: Amplitude modulation

See also: Frequency modulation and Phase modulation

The amplitude of a wave may be constant (in which case the wave is a c.w. or continuous wave), or may be modulated so as to vary with time and/or position. The outline of the variation in amplitude is called the envelope of the wave. Mathematically, the modulated wave can be written in the form:^[8]^[9]^[10]

$u(x,t) = A(x,t)\sin (kx - \omega t + \phi) \ ,$

where $A(x,\ t)$ is the amplitude envelope of the wave, $k$ is the wavenumber and $\phi$ is the phase. If the group velocity $v_g$ (see below) is wavelength-independent, this equation can be simplified as:^[11]

$u(x,t) = A(x - v_g \ t)\sin (kx - \omega t + \phi) \ ,$

showing that the envelope moves with the group velocity and retains its shape. Otherwise, in cases where the group velocity varies with wavelength, the pulse shape changes in a manner often described using an envelope equation.^[11]^[12]

Phase velocity and group velocity

Frequency dispersion in groups of gravity waves on the surface of deep water. The red dot moves with the phase velocity, and the green dots propagate with the group velocity.

Main articles: Phase velocity and Group velocity

There are two velocities that are associated with waves, the phase velocity and the group velocity. To understand them, one must consider several types of waveform. For simplification, examination is restricted to one dimension.

This shows a wave with the Group velocity and Phase velocity going in different directions.

The most basic wave (a form of plane wave) may be expressed in the form:

$\psi (x,t) = A e^{i \left( kx - \omega t \right)} \ ,$

which can be related to the usual sine and cosine forms using Euler's formula. Rewriting the argument, $kx-\omega t = \left(\frac{2\pi}{\lambda}\right)(x - vt)$ , makes clear that this expression describes a vibration of wavelength $\lambda = \frac{2\pi}{k}$ traveling in the x-direction with a constant phase velocity $v_p = \frac{\omega}{k}\,$ .^[13]
The other type of wave to be considered is one with localized structure described by an envelope, which may be expressed mathematically as, for example:

$\psi (x,t) = \int_{-\infty} ^{\infty}\ dk_1 \ A(k_1)\ e^{i\left(k_1x - \omega t \right)} \ ,$

where now A(k₁) (the integral is the inverse fourier transform of A(k1)) is a function exhibiting a sharp peak in a region of wave vectors Δk surrounding the point k₁ = k. In exponential form:

$A = A_o (k_1) e^ {i \alpha (k_1)} \ ,$

with A_o the magnitude of A. For example, a common choice for A_o is a Gaussian wave packet:^[14]

$A_o (k_1) = N\ e^{-\sigma^2 (k_1-k)^2 / 2} \ ,$

where σ determines the spread of k₁-values about k, and N is the amplitude of the wave.
The exponential function inside the integral for ψ oscillates rapidly with its argument, say φ(k₁), and where it varies rapidly, the exponentials cancel each other out, interfere destructively, contributing little to ψ.^[13] However, an exception occurs at the location where the argument φ of the exponential varies slowly. (This observation is the basis for the method of stationary phase for evaluation of such integrals.^[15]) The condition for φ to vary slowly is that its rate of change with k₁ be small; this rate of variation is:^[13]

$\left . \frac{d \varphi }{d k_1} \right | _{k_1 = k } = x - t \left . \frac{d \omega}{dk_1}\right | _{k_1 = k } +\left . \frac{d \alpha}{d k_1}\right | _{k_1 = k } \ ,$

where the evaluation is made at k₁ = k because A(k₁) is centered there. This result shows that the position x where the phase changes slowly, the position where ψ is appreciable, moves with time at a speed called the group velocity:

$v_g = \frac{d \omega}{dk} \ .$

The group velocity therefore depends upon the dispersion relation connecting ω and k. For example, in quantum mechanics the energy of a particle represented as a wave packet is E = ħω = (ħk)²/(2m). Consequently, for that wave situation, the group velocity is

$v_g = \frac {\hbar k}{m} \ ,$

showing that the velocity of a localized particle in quantum mechanics is its group velocity.^[13] Because the group velocity varies with k, the shape of the wave packet broadens with time, and the particle becomes less localized.^[16] In other words, the velocity of the constituent waves of the wave packet travel at a rate that varies with their wavelength, so some move faster than others, and they cannot maintain the same interference pattern as the wave propagates.

Sinusoidal waves

Sinusoidal waves correspond to simple harmonic motion.

Mathematically, the most basic wave is the (spatially) one-dimensional sine wave (or harmonic wave or sinusoid) with an amplitude $u$ described by the equation:

$u(x,t)= A \sin (kx- \omega t + \phi) \ ,$

where

$A$ is the maximum amplitude of the wave, maximum distance from the highest point of the disturbance in the medium (the crest) to the equilibrium point during one wave cycle. In the illustration to the right, this is the maximum vertical distance between the baseline and the wave.
$x$ is the space coordinate
$t$ is the time coordinate
$k$ is the wavenumber
$\omega$ is the angular frequency
$\phi$ is the phase constant.

The units of the amplitude depend on the type of wave. Transverse mechanical waves (e.g., a wave on a string) have an amplitude expressed as a distance (e.g., meters), longitudinal mechanical waves (e.g., sound waves) use units of pressure (e.g., pascals), and electromagnetic waves (a form of transverse vacuum wave) express the amplitude in terms of its electric field (e.g., volts/meter).
The wavelength $\lambda$ is the distance between two sequential crests or troughs (or other equivalent points), generally is measured in meters. A wavenumber $k$ , the spatial frequency of the wave in radians per unit distance (typically per meter), can be associated with the wavelength by the relation

$k = \frac{2 \pi}{\lambda}. \,$

The period $T$ is the time for one complete cycle of an oscillation of a wave. The frequency $f$ is the number of periods per unit time (per second) and is typically measured in hertz. These are related by:

$f=\frac{1}{T}. \,$

In other words, the frequency and period of a wave are reciprocals.
The angular frequency $\omega$ represents the frequency in radians per second. It is related to the frequency or period by

$\omega = 2 \pi f = \frac{2 \pi}{T}. \,$

The wavelength $\lambda$ of a sinusoidal waveform traveling at constant speed $v$ is given by:^[17]

$\lambda = \frac{v}{f},$

where $v$ is called the phase speed (magnitude of the phase velocity) of the wave and $f$ is the wave's frequency.
Wavelength can be a useful concept even if the wave is not periodic in space. For example, in an ocean wave approaching shore, the incoming wave undulates with a varying local wavelength that depends in part on the depth of the sea floor compared to the wave height. The analysis of the wave can be based upon comparison of the local wavelength with the local water depth.^[18]
Although arbitrary wave shapes will propagate unchanged in lossless linear time-invariant systems, in the presence of dispersion the sine wave is the unique shape that will propagate unchanged but for phase and amplitude, making it easy to analyze.^[19] Due to the Kramers–Kronig relations, a linear medium with dispersion also exhibits loss, so the sine wave propagating in a dispersive medium is attenuated in certain frequency ranges that depend upon the medium.^[20] The sine function is periodic, so the sine wave or sinusoid has a wavelength in space and a period in time.^[21]^[22]
The sinusoid is defined for all times and distances, whereas in physical situations we usually deal with waves that exist for a limited span in space and duration in time. Fortunately, an arbitrary wave shape can be decomposed into an infinite set of sinusoidal waves by the use of Fourier analysis. As a result, the simple case of a single sinusoidal wave can be applied to more general cases.^[23]^[24] In particular, many media are linear, or nearly so, so the calculation of arbitrary wave behavior can be found by adding up responses to individual sinusoidal waves using the superposition principle to find the solution for a general waveform.^[25] When a medium is nonlinear, the response to complex waves cannot be determined from a sine-wave decomposition.

Plane waves

Main article: Plane wave

Standing waves

Main articles: Standing wave, Acoustic resonance, Helmholtz resonator, and Organ pipe

Standing wave in stationary medium. The red dots represent the wave nodes

A standing wave, also known as a stationary wave, is a wave that remains in a constant position. This phenomenon can occur because the medium is moving in the opposite direction to the wave, or it can arise in a stationary medium as a result of interference between two waves traveling in opposite directions.
The sum of two counter-propagating waves (of equal amplitude and frequency) creates a standing wave. Standing waves commonly arise when a boundary blocks further propagation of the wave, thus causing wave reflection, and therefore introducing a counter-propagating wave. For example when a violin string is displaced, transverse waves propagate out to where the string is held in place at the bridge and the nut, where the waves are reflected back. At the bridge and nut, the two opposed waves are in antiphase and cancel each other, producing a node. Halfway between two nodes there is an antinode, where the two counter-propagating waves enhance each other maximally. There is no net propagation of energy over time.

One-dimensional standing waves; the fundamental mode and the first 5 overtones.
A two-dimensional standing wave on a disk; this is the fundamental mode.
A standing wave on a disk with two nodal lines crossing at the center; this is an overtone.

Physical properties

Light beam exhibiting reflection, refraction, transmission and dispersion when encountering a prism

Waves exhibit common behaviors under a number of standard situations, e. g.

Transmission and media

Main articles: Rectilinear propagation, Transmittance, and Transmission medium

Waves normally move in a straight line (i.e. rectilinearly) through a transmission medium. Such media can be classified into one or more of the following categories:

A bounded medium if it is finite in extent, otherwise an unbounded medium
A linear medium if the amplitudes of different waves at any particular point in the medium can be added
A uniform medium or homogeneous medium if its physical properties are unchanged at different locations in space
An anisotropic medium if one or more of its physical properties differ in one or more directions
An isotropic medium if its physical properties are the same in all directions

Absorption

Main articles: Absorption (acoustics) and Absorption (electromagnetic radiation)

Absorption of waves mean, if a kind of wave strikes a matter, it will be absorbed by the matter. When a wave with that same natural frequency impinges upon an atom, then the electrons of that atom will be set into vibrational motion. If a wave of a given frequency strikes a material with electrons having the same vibrational frequencies, then those electrons will absorb the energy of the wave and transform it into vibrational motion.

Reflection

Main article: Reflection (physics)

When a wave strikes a reflective surface, it changes direction, such that the angle made by the incident wave and line normal to the surface equals the angle made by the reflected wave and the same normal line.

Interference

Main article: Interference (wave propagation)

Waves that encounter each other combine through superposition to create a new wave called an interference pattern. Important interference patterns occur for waves that are in phase.

Refraction

Main article: Refraction

Sinusoidal traveling plane wave entering a region of lower wave velocity at an angle, illustrating the decrease in wavelength and change of direction (refraction) that results.

Refraction is the phenomenon of a wave changing its speed. Mathematically, this means that the size of the phase velocity changes. Typically, refraction occurs when a wave passes from one medium into another. The amount by which a wave is refracted by a material is given by the refractive index of the material. The directions of incidence and refraction are related to the refractive indices of the two materials by Snell's law.

Diffraction

Main article: Diffraction

A wave exhibits diffraction when it encounters an obstacle that bends the wave or when it spreads after emerging from an opening. Diffraction effects are more pronounced when the size of the obstacle or opening is comparable to the wavelength of the wave.

Polarization

Main article: Polarization (waves)

A wave is polarized if it oscillates in one direction or plane. A wave can be polarized by the use of a polarizing filter. The polarization of a transverse wave describes the direction of oscillation in the plane perpendicular to the direction of travel.
Longitudinal waves such as sound waves do not exhibit polarization. For these waves the direction of oscillation is along the direction of travel.

Dispersion

Schematic of light being dispersed by a prism. Click to see animation.

Main articles: Dispersion (optics) and Dispersion (water waves)

A wave undergoes dispersion when either the phase velocity or the group velocity depends on the wave frequency. Dispersion is most easily seen by letting white light pass through a prism, the result of which is to produce the spectrum of colours of the rainbow. Isaac Newton performed experiments with light and prisms, presenting his findings in the Opticks (1704) that white light consists of several colours and that these colours cannot be decomposed any further.^[26]

Mechanical waves

Main article: Mechanical wave

Waves on strings

Main article: Vibrating string

The speed of a transverse wave traveling along a vibrating string ( v ) is directly proportional to the square root of the tension of the string ( T ) over the linear mass density ( μ ):

$v=\sqrt{\frac{T}{\mu}}, \,$

where the linear density μ is the mass per unit length of the string.

Acoustic waves

Acoustic or sound waves travel at speed given by

$v=\sqrt{\frac{B}{\rho_0}}, \,$

or the square root of the adiabatic bulk modulus divided by the ambient fluid density (see speed of sound).

Water waves

Main article: Water waves

Ripples on the surface of a pond are actually a combination of transverse and longitudinal waves; therefore, the points on the surface follow orbital paths.
Sound—a mechanical wave that propagates through gases, liquids, solids and plasmas;
Inertial waves, which occur in rotating fluids and are restored by the Coriolis effect;
Ocean surface waves, which are perturbations that propagate through water.

Seismic waves

Main article: Seismic waves

Shock waves

Main article: Shock wave

See also: Sonic boom and Cherenkov radiation

Other

Waves of traffic, that is, propagation of different densities of motor vehicles, and so forth, which can be modeled as kinematic waves^[27]

Metachronal wave refers to the appearance of a traveling wave produced by coordinated sequential actions.

It is worth noting that the mass-energy equivalence equation can be solved for this form: $c=\sqrt{\frac{e}{m}}$ .

Electromagnetic waves

Main articles: Electromagnetic radiation and Electromagnetic spectrum

(radio, micro, infrared, visible, uv)
An electromagnetic wave consists of two waves that are oscillations of the electric and magnetic fields. An electromagnetic wave travels in a direction that is at right angles to the oscillation direction of both fields. In the 19th century, James Clerk Maxwell showed that, in vacuum, the electric and magnetic fields satisfy the wave equation both with speed equal to that of the speed of light. From this emerged the idea that light is an electromagnetic wave. Electromagnetic waves can have different frequencies (and thus wavelengths), giving rise to various types of radiation such as radio waves, microwaves, infrared, visible light, ultraviolet and X-rays.

Quantum mechanical waves

Main article: Schrödinger equation

de Broglie waves

Main articles: Wave packet and Matter wave

Louis de Broglie postulated that all particles with momentum have a wavelength

$\lambda = \frac{h}{p},$

where h is Planck's constant, and p is the magnitude of the momentum of the particle. This hypothesis was at the basis of quantum mechanics. Nowadays, this wavelength is called the de Broglie wavelength. For example, the electrons in a CRT display have a de Broglie wavelength of about 10⁻¹³ m.
A wave representing such a particle traveling in the k-direction is expressed by the wave function as follows:

$\psi (\mathbf{r}, \ t=0) =A\ e^{i\mathbf{k \cdot r}} \ ,$

where the wavelength is determined by the wave vector k as:

$\lambda = \frac {2 \pi}{k} \ ,$

and the momentum by:

$\mathbf{p} = \hbar \mathbf{k} \ .$

However, a wave like this with definite wavelength is not localized in space, and so cannot represent a particle localized in space. To localize a particle, de Broglie proposed a superposition of different wavelengths ranging around a central value in a wave packet,^[29] a waveform often used in quantum mechanics to describe the wave function of a particle. In a wave packet, the wavelength of the particle is not precise, and the local wavelength deviates on either side of the main wavelength value.
In representing the wave function of a localized particle, the wave packet is often taken to have a Gaussian shape and is called a Gaussian wave packet.^[30] Gaussian wave packets also are used to analyze water waves.^[31]
For example, a Gaussian wavefunction ψ might take the form:^[32]

$\psi(x,\ t=0) = A\ \exp \left( -\frac{x^2}{2\sigma^2} + i k_0 x \right) \ ,$

at some initial time t = 0, where the central wavelength is related to the central wave vector k₀ as λ₀ = 2π / k₀. It is well known from the theory of Fourier analysis,^[33] or from the Heisenberg uncertainty principle (in the case of quantum mechanics) that a narrow range of wavelengths is necessary to produce a localized wave packet, and the more localized the envelope, the larger the spread in required wavelengths. The Fourier transform of a Gaussian is itself a Gaussian.^[34] Given the Gaussian:

$f(x) = e^{-x^2 / (2\sigma^2)} \ ,$

the Fourier transform is:

$\tilde{ f} (k) = \sigma e^{-\sigma^2 k^2 / 2} \ .$

The Gaussian in space therefore is made up of waves:

$f(x) = \frac{1}{\sqrt{2 \pi}} \int_{-\infty}^{\infty} \ \tilde{f} (k) e^{ikx} \ dk \ ;$

that is, a number of waves of wavelengths λ such that kλ = 2 π.
The parameter σ decides the spatial spread of the Gaussian along the x-axis, while the Fourier transform shows a spread in wave vector k determined by 1/σ. That is, the smaller the extent in space, the larger the extent in k, and hence in λ = 2π/k.

Animation showing the effect of a cross-polarized gravitational wave on a ring of test particles

Gravitational waves

Main article: Gravitational wave

Researchers believe that gravitational waves also travel through space, although gravitational waves have never been directly detected. Not to be confused with gravity waves, gravitational waves are disturbances in the curvature of spacetime, predicted by Einstein's theory of general relativity.

WKB method

Main article: WKB method

In a nonuniform medium, in which the wavenumber k can depend on the location as well as the frequency, the phase term kx is typically replaced by the integral of k(x)dx, according to the WKB method. Such nonuniform traveling waves are common in many physical problems, including the mechanics of the cochlea and waves on hanging ropes.

See also

NICS.

Mechanics (Greek Μηχανική) is the branch of science concerned with the behavior of physical bodies when subjected to forces or displacements, and the subsequent effects of the bodies on their environment. The scientific discipline has its origins in Ancient Greece with the writings of Aristotle and Archimedes^[1]^[2]^[3] (see History of classical mechanics and Timeline of classical mechanics). During the early modern period, scientists such as Galileo, Kepler, and especially Newton, laid the foundation for what is now known as classical mechanics. It is a branch of classical physics that deals with particles that are either at rest or are moving with velocities significantly less than the speed of light. It can also be defined as a branch of science which deals with the motion of and forces on objects.

Classical versus quantum

Classical mechanics
History Timeline
Branches[hide] Statics Dynamics Kinetics Kinematics Applied mechanics Celestial mechanics Continuum mechanics Statistical mechanics
Formulations[show]
Fundamental concepts[show]
Core topics[show]
Rotational motion [show]
Scientists[show]
v t e

Quantum mechanics
Introduction Glossary · History
Background[show]
Fundamental concepts[show]
Experiments[show]
Formulations[show]
Equations[show]
Interpretations[show]
Advanced topics[show]
Scientists[show]
v t e

The major division of the mechanics discipline separates classical mechanics from quantum mechanics.
Historically, classical mechanics came first, while quantum mechanics is a comparatively recent invention. Classical mechanics originated with Isaac Newton's laws of motion in Principia Mathematica; Quantum Mechanics was discovered in 1925. Both are commonly held to constitute the most certain knowledge that exists about physical nature. Classical mechanics has especially often been viewed as a model for other so-called exact sciences. Essential in this respect is the relentless use of mathematics in theories, as well as the decisive role played by experiment in generating and testing them.
Quantum mechanics is of a wider scope, as it encompasses classical mechanics as a sub-discipline which applies under certain restricted circumstances. According to the correspondence principle, there is no contradiction or conflict between the two subjects, each simply pertains to specific situations. The correspondence principle states that the behavior of systems described by quantum theories reproduces classical physics in the limit of large quantum numbers. Quantum mechanics has superseded classical mechanics at the foundational level and is indispensable for the explanation and prediction of processes at molecular and (sub)atomic level. However, for macroscopic processes classical mechanics is able to solve problems which are unmanageably difficult in quantum mechanics and hence remains useful and well used. Modern descriptions of such behavior begin with a careful definition of such quantities as displacement (distance moved), time, velocity, acceleration, mass, and force. Until about 400 years ago, however, motion was explained from a very different point of view. For example, following the ideas of Greek philosopher and scientist Aristotle, scientists reasoned that a cannonball falls down because its natural position is in the Earth; the sun, the moon, and the stars travel in circles around the earth because it is the nature of heavenly objects to travel in perfect circles.
The Italian physicist and astronomer Galileo brought together the ideas of other great thinkers of his time and began to analyze motion in terms of distance traveled from some starting position and the time that it took. He showed that the speed of falling objects increases steadily during the time of their fall. This acceleration is the same for heavy objects as for light ones, provided air friction (air resistance) is discounted. The English mathematician and physicist Isaac Newton improved this analysis by defining force and mass and relating these to acceleration. For objects traveling at speeds close to the speed of light, Newton’s laws were superseded by Albert Einstein’s theory of relativity. For atomic and subatomic particles, Newton’s laws were superseded by quantum theory. For everyday phenomena, however, Newton’s three laws of motion remain the cornerstone of dynamics, which is the study of what causes motion.

Relativistic versus Newtonian mechanics

In analogy to the distinction between quantum and classical mechanics, Einstein's general and special theories of relativity have expanded the scope of Newton and Galileo's formulation of mechanics. The differences between relativistic and Newtonian mechanics become significant and even dominant as the velocity of a massive body approaches the speed of light. For instance, in Newtonian mechanics, Newton's laws of motion specify that $F=ma$ , whereas in Relativistic mechanics and Lorentz transformations, which were first discovered by Hendrik Lorentz, $F=\gamma ma$ ( $\gamma$ is the Lorentz factor, which is almost equal to 1 for low speeds).

General relativistic versus quantum

Relativistic corrections are also needed for quantum mechanics, although general relativity has not been integrated. The two theories remain incompatible, a hurdle which must be overcome in developing a theory of everything.

History

Main articles: History of classical mechanics and History of quantum mechanics

Antiquity

Main article: Aristotelian mechanics

The main theory of mechanics in antiquity was Aristotelian mechanics.^[4] A later developer in this tradition is Hipparchus.^[5]

Medieval age

Main article: Theory of impetus

Arabic Machine Manuscript. Unknown date (at a guess: 16th to 19th centuries).

In the Middle Ages, Aristotle's theories were criticized and modified by a number of figures, beginning with John Philoponus in the 6th century. A central problem was that of projectile motion, which was discussed by Hipparchus and Philoponus. This led to the development of the theory of impetus by 14th century French Jean Buridan, which developed into the modern theories of inertia, velocity, acceleration and momentum. This work and others was developed in 14th century England by the Oxford Calculators such as Thomas Bradwardine, who studied and formulated various laws regarding falling bodies.
On the question of a body subject to a constant (uniform) force, the 12th century Jewish-Arab Nathanel (Iraqi, of Baghdad) stated that constant force imparts constant acceleration, while the main properties are uniformly accelerated motion (as of falling bodies) was worked out by the 14th century Oxford Calculators.

Early modern age

Two central figures in the early modern age are Galileo Galilei and Isaac Newton. Galileo's final statement of his mechanics, particularly of falling bodies, is his Two New Sciences (1638). Newton's 1687 Philosophiæ Naturalis Principia Mathematica provided a detailed mathematical account of mechanics, using the newly developed mathematics of calculus and providing the basis of Newtonian mechanics.^[5]
There is some dispute over priority of various ideas: Newton's Principia is certainly the seminal work and has been tremendously influential, and the systematic mathematics therein did not and could not have been stated earlier because calculus had not been developed. However, many of the ideas, particularly as pertain to inertia (impetus) and falling bodies had been developed and stated by earlier researchers, both the then-recent Galileo and the less-known medieval predecessors. Precise credit is at times difficult or contentious because scientific language and standards of proof changed, so whether medieval statements are equivalent to modern statements or sufficient proof, or instead similar to modern statements and hypotheses is often debatable.

Modern age

Two main modern developments in mechanics are general relativity of Einstein, and quantum mechanics, both developed in the 20th century based in part on earlier 19th century ideas.

Types of mechanical bodies

Thus the often-used term body needs to stand for a wide assortment of objects, including particles, projectiles, spacecraft, stars, parts of machinery, parts of solids, parts of fluids (gases and liquids), etc.
Other distinctions between the various sub-disciplines of mechanics, concern the nature of the bodies being described. Particles are bodies with little (known) internal structure, treated as mathematical points in classical mechanics. Rigid bodies have size and shape, but retain a simplicity close to that of the particle, adding just a few so-called degrees of freedom, such as orientation in space.
Otherwise, bodies may be semi-rigid, i.e. elastic, or non-rigid, i.e. fluid. These subjects have both classical and quantum divisions of study.
For instance, the motion of a spacecraft, regarding its orbit and attitude (rotation), is described by the relativistic theory of classical mechanics, while the analogous movements of an atomic nucleus are described by quantum mechanics.

Sub-disciplines in mechanics

The following are two lists of various subjects that are studied in mechanics.
Note that there is also the "theory of fields" which constitutes a separate discipline in physics, formally treated as distinct from mechanics, whether classical fields or quantum fields. But in actual practice, subjects belonging to mechanics and fields are closely interwoven. Thus, for instance, forces that act on particles are frequently derived from fields (electromagnetic or gravitational), and particles generate fields by acting as sources. In fact, in quantum mechanics, particles themselves are fields, as described theoretically by the wave function.

Classical mechanics

Prof. Walter Lewin explains Newton's law of gravitation in MIT course 8.01^[6]

The following are described as forming classical mechanics:

Newtonian mechanics, the original theory of motion (kinematics) and forces (dynamics).
Analytical mechanics is a reformulation of Newtonian mechanics with an emphasis on system energy, rather than on forces. There are two main branches of analytical mechanics:
- Hamiltonian mechanics, a theoretical formalism, based on the principle of conservation of energy.
- Lagrangian mechanics, another theoretical formalism, based on the principle of the least action.
Classical statistical mechanics generalizes ordinary classical mechanics to consider systems in an unknown state; often used to derive thermodynamic properties.
Celestial mechanics, the motion of bodies in space: planets, comets, stars, galaxies, etc.
Astrodynamics, spacecraft navigation, etc.
Solid mechanics, elasticity, the properties of deformable bodies.
Fracture mechanics
Acoustics, sound ( = density variation propagation) in solids, fluids and gases.
Statics, semi-rigid bodies in mechanical equilibrium
Fluid mechanics, the motion of fluids
Soil mechanics, mechanical behavior of soils
Continuum mechanics, mechanics of continua (both solid and fluid)
Hydraulics, mechanical properties of liquids
Fluid statics, liquids in equilibrium
Applied mechanics, or Engineering mechanics
Biomechanics, solids, fluids, etc. in biology
Biophysics, physical processes in living organisms
Relativistic or Einsteinian mechanics, universal gravitation.

Quantum mechanics

The following are categorized as being part of quantum mechanics:

Schrödinger wave mechanics, used to describe the motion of the wavefunction of a single particle.
Matrix mechanics is an alternative formulation that allows considering systems with a finite-dimensional state space.
Quantum statistical mechanics generalizes ordinary quantum mechanics to consider systems in an unknown state; often used to derive thermodynamic properties.
Particle physics, the motion, structure, and reactions of particles
Nuclear physics, the motion, structure, and reactions of nuclei
Condensed matter physics, quantum gases, solids, liquids, etc.

Professional organizations

Applied Mechanics Division, American Society of Mechanical Engineers
Fluid Dynamics Division, American Physical Society
Institution of Mechanical Engineers is the United Kingdom's qualifying body for Mechanical Engineers and has been the home of Mechanical Engineers for over 150 years.

International Union of Theoretical and Applied Mechanics

Optics

From Wikipedia, the free encyclopedia

This article is about the branch of physics. For the book by Sir Isaac Newton, see Opticks. For the musical artist, see Optical (artist). For other uses, see Optic (disambiguation).

Optics includes study of dispersion of light.

Optics is the branch of physics which involves the behaviour and properties of light, including its interactions with matter and the construction of instruments that use or detect it.^[1] Optics usually describes the behaviour of visible, ultraviolet, and infrared light. Because light is an electromagnetic wave, other forms of electromagnetic radiation such as X-rays, microwaves, and radio waves exhibit similar properties.^[1]
Most optical phenomena can be accounted for using the classical electromagnetic description of light. Complete electromagnetic descriptions of light are, however, often difficult to apply in practice. Practical optics is usually done using simplified models. The most common of these, geometric optics, treats light as a collection of rays that travel in straight lines and bend when they pass through or reflect from surfaces. Physical optics is a more comprehensive model of light, which includes wave effects such as diffraction and interference that cannot be accounted for in geometric optics. Historically, the ray-based model of light was developed first, followed by the wave model of light. Progress in electromagnetic theory in the 19th century led to the discovery that light waves were in fact electromagnetic radiation.
Some phenomena depend on the fact that light has both wave-like and particle-like properties. Explanation of these effects requires quantum mechanics. When considering light's particle-like properties, the light is modelled as a collection of particles called "photons". Quantum optics deals with the application of quantum mechanics to optical systems.
Optical science is relevant to and studied in many related disciplines including astronomy, various engineering fields, photography, and medicine (particularly ophthalmology and optometry). Practical applications of optics are found in a variety of technologies and everyday objects, including mirrors, lenses, telescopes, microscopes, lasers, and fibre optics.

Contents

1 History

2 Classical optics

2.1 Geometrical optics

2.1.1 Approximations

2.1.2 Reflections

2.1.3 Refractions

2.2 Physical optics

2.2.1 Modelling and design of optical systems using physical optics

2.2.2 Superposition and interference

2.2.3 Diffraction and optical resolution

2.2.4 Dispersion and scattering

2.2.5 Polarization

2.2.5.1 Changing polarization

2.2.5.2 Natural light

3 Modern optics

3.1 Lasers

3.2 Kapitsa–Dirac effect

4 Applications

4.1 Human eye

4.1.1 Visual effects

4.1.2 Optical instruments

4.2 Photography

4.3 Atmospheric optics

5 See also

6 References

7 External links

History

Main article: History of optics

See also: Timeline of electromagnetism and classical optics

The Nimrud lens

Optics began with the development of lenses by the ancient Egyptians and Mesopotamians. The earliest known lenses, made from polished crystal, often quartz, date from as early as 700 BC for Assyrian lenses such as the Layard/Nimrud lens.^[2] The ancient Romans and Greeks filled glass spheres with water to make lenses. These practical developments were followed by the development of theories of light and vision by ancient Greek and Indian philosophers, and the development of geometrical optics in the Greco-Roman world. The word optics comes from the ancient Greek word ὀπτική, meaning "appearance, look".^[3]
Greek philosophy on optics broke down into two opposing theories on how vision worked, the "intro-mission theory" and the "emission theory".^[4] The intro-mission approach saw vision as coming from objects casting off copies of themselves (called eidola) that were captured by the eye. With many propagators including Democritus, Epicurus, Aristotle and their followers, this theory seems to have some contact with modern theories of what vision really is, but it remained only speculation lacking any experimental foundation.
Plato first articulated the emission theory, the idea that visual perception is accomplished by rays emitted by the eyes. He also commented on the parity reversal of mirrors in Timaeus.^[5] Some hundred years later, Euclid wrote a treatise entitled Optics where he linked vision to geometry, creating geometrical optics.^[6] He based his work on Plato's emission theory wherein he described the mathematical rules of perspective and described the effects of refraction qualitatively, although he questioned that a beam of light from the eye could instantaneously light up the stars every time someone blinked.^[7] Ptolemy, in his treatise Optics, held an extramission-intromission theory of vision: the rays (or flux) from the eye formed a cone, the vertex being within the eye, and the base defining the visual field. The rays were sensitive, and conveyed information back to the observer’s intellect about the distance and orientation of surfaces. He summarised much of Euclid and went on to describe a way to measure the angle of refraction, though he failed to notice the empirical relationship between it and the angle of incidence.^[8]

Reproduction of a page of Ibn Sahl's manuscript showing his knowledge of the law of refraction, now known as Snell's law

During the Middle Ages, Greek ideas about optics were resurrected and extended by writers in the Muslim world. One of the earliest of these was Al-Kindi (c. 801–73) who wrote on the merits of Aristotelian and Euclidean ideas of optics, favouring the emission theory since it could better quantify optical phenomenon.^[9] In 984, the Persian mathematician Ibn Sahl wrote the treatise "On burning mirrors and lenses", correctly describing a law of refraction equivalent to Snell's law.^[10] He used this law to compute optimum shapes for lenses and curved mirrors. In the early 11th century, Alhazen (Ibn al-Haytham) wrote the Book of Optics (Kitab al-manazir) in which he explored reflection and refraction and proposed a new system for explaining vision and light based on observation and experiment.^[11]^[12]^[13]^[14]^[15] He rejected the "emission theory" of Ptolemaic optics with its rays being emitted by the eye, and instead put forward the idea that light reflected in all directions in straight lines from all points of the objects being viewed and then entered the eye, although he was unable to correctly explain how the eye captured the rays.^[16] Alhazen's work was largely ignored in the Arabic world but it was anonymously translated into Latin around 1200 A.D. and further summarised and expanded on by the Polish monk Witelo^[17] making it a standard text on optics in Europe for the next 400 years.
In the 13th century medieval Europe the English bishop Robert Grosseteste wrote on a wide range of scientific topics discussing light from four different perspectives: an epistemology of light, a metaphysics or cosmogony of light, an etiology or physics of light, and a theology of light,^[18] basing it on the works Aristotle and Platonism. Grosseteste's most famous disciple, Roger Bacon, wrote works citing a wide range of recently translated optical and philosophical works, including those of Alhazen, Aristotle, Avicenna, Averroes, Euclid, al-Kindi, Ptolemy, Tideus, and Constantine the African. Bacon was able to use parts of glass spheres as magnifying glasses to demonstrate that light reflects from objects rather than being released from them.
In Italy, around 1284, Salvino D'Armate invented the first wearable eyeglasses.^[19] This was the start of the optical industry of grinding and polishing lenses for these "spectacles", first in Venice and Florence in the thirteenth century,^[20] and later in the spectacle making centres in both the Netherlands and Germany.^[21] Spectacle makers created improved types of lenses for the correction of vision based more on empirical knowledge gained from observing the effects of the lenses rather than using the rudimentary optical theory of the day (theory which for the most part could not even adequately explain how spectacles worked).^[22]^[23] This practical development, mastery, and experimentation with lenses led directly to the invention of the compound optical microscope around 1595, and the refracting telescope in 1608, both of which appeared in the spectacle making centres in the Netherlands.^[24]^[25]
In the early 17th century Johannes Kepler expanded on geometric optics in his writings, covering lenses, reflection by flat and curved mirrors, the principles of pinhole cameras, inverse-square law governing the intensity of light, and the optical explanations of astronomical phenomena such as lunar and solar eclipses and astronomical parallax. He was also able to correctly deduce the role of the retina as the actual organ that recorded images, finally being able to scientifically quantify the effects of different types of lenses that spectacle makers had been observing over the previous 300 years.^[26] After the invention of the telescope Kepler set out the theoretical basis on how they worked and described an improved version, known as the Keplerian telescope, using two convex lenses to produce higher magnification.^[27]

Cover of the first edition of Newton's Opticks

Optical theory progressed in the mid-17th century with treatises written by philosopher René Descartes, which explained a variety of optical phenomena including reflection and refraction by assuming that light was emitted by objects which produced it.^[28] This differed substantively from the ancient Greek emission theory. In the late 1660s and early 1670s, Newton expanded Descartes' ideas into a corpuscle theory of light, famously determining that white light was a mix of colours which can be separated into its component parts with a prism. In 1690, Christiaan Huygens proposed a wave theory for light based on suggestions that had been made by Robert Hooke in 1664. Hooke himself publicly criticised Newton's theories of light and the feud between the two lasted until Hooke's death. In 1704, Newton published Opticks and, at the time, partly because of his success in other areas of physics, he was generally considered to be the victor in the debate over the nature of light.^[28]
Newtonian optics was generally accepted until the early 19th century when Thomas Young and Augustin-Jean Fresnel conducted experiments on the interference of light that firmly established light's wave nature. Young's famous double slit experiment showed that light followed the law of superposition, which is a wave-like property not predicted by Newton's corpuscle theory. This work led to a theory of diffraction for light and opened an entire area of study in physical optics.^[29] Wave optics was successfully unified with electromagnetic theory by James Clerk Maxwell in the 1860s.^[30]
The next development in optical theory came in 1899 when Max Planck correctly modelled blackbody radiation by assuming that the exchange of energy between light and matter only occurred in discrete amounts he called quanta.^[31] In 1905 Albert Einstein published the theory of the photoelectric effect that firmly established the quantization of light itself.^[32]^[33] In 1913 Niels Bohr showed that atoms could only emit discrete amounts of energy, thus explaining the discrete lines seen in emission and absorption spectra.^[34] The understanding of the interaction between light and matter which followed from these developments not only formed the basis of quantum optics but also was crucial for the development of quantum mechanics as a whole. The ultimate culmination, the theory of quantum electrodynamics, explains all optics and electromagnetic processes in general as the result of the exchange of real and virtual photons.^[35]
Quantum optics gained practical importance with the inventions of the maser in 1953 and of the laser in 1960.^[36] Following the work of Paul Dirac in quantum field theory, George Sudarshan, Roy J. Glauber, and Leonard Mandel applied quantum theory to the electromagnetic field in the 1950s and 1960s to gain a more detailed understanding of photodetection and the statistics of light.

Classical optics
Classical optics is divided into two main branches: geometrical optics and physical optics. In geometrical, or ray optics, light is considered to travel in straight lines, and in physical, or wave optics, light is considered to be an electromagnetic wave.
Geometrical optics can be viewed as an approximation of physical optics which can be applied when the wavelength of the light used is much smaller than the size of the optical elements or system being modelled.

Geometrical optics

Main article: Geometrical optics

Geometry of reflection and refraction of light rays

Geometrical optics, or ray optics, describes the propagation of light in terms of "rays" which travel in straight lines, and whose paths are governed by the laws of reflection and refraction at interfaces between different media.^[37] These laws were discovered empirically as far back as 984 AD^[10] and have been used in the design of optical components and instruments from then until the present day. They can be summarised as follows:
When a ray of light hits the boundary between two transparent materials, it is divided into a reflected and a refracted ray.

The law of reflection says that the reflected ray lies in the plane of incidence, and the angle of reflection equals the angle of incidence.

The law of refraction says that the refracted ray lies in the plane of incidence, and the sine of the angle of refraction divided by the sine of the angle of incidence is a constant.

$\frac {\sin {\theta_1}}{\sin {\theta_2}} = n$

where $n$ is a constant for any two materials and a given colour of light. It is known as the refractive index.
The laws of reflection and refraction can be derived from Fermat's principle which states that the path taken between two points by a ray of light is the path that can be traversed in the least time.^[38]

Approximations
Geometric optics is often simplified by making the paraxial approximation, or "small angle approximation." The mathematical behaviour then becomes linear, allowing optical components and systems to be described by simple matrices. This leads to the techniques of Gaussian optics and paraxial ray tracing, which are used to find basic properties of optical systems, such as approximate image and object positions and magnifications.^[39]

Reflections

Main article: Reflection (physics)

Diagram of specular reflection

Reflections can be divided into two types: specular reflection and diffuse reflection. Specular reflection describes the gloss of surfaces such as mirrors, which reflect light in a simple, predictable way. This allows for production of reflected images that can be associated with an actual (real) or extrapolated (virtual) location in space. Diffuse reflection describes opaque, non limpid materials, such as paper or rock. The reflections from these surfaces can only be described statistically, with the exact distribution of the reflected light depending on the microscopic structure of the material. Many diffuse reflectors are described or can be approximated by Lambert's cosine law, which describes surfaces that have equal luminance when viewed from any angle. Glossy surfaces can give both specular and diffuse reflection.
In specular reflection, the direction of the reflected ray is determined by the angle the incident ray makes with the surface normal, a line perpendicular to the surface at the point where the ray hits. The incident and reflected rays and the normal lie in a single plane, and the angle between the reflected ray and the surface normal is the same as that between the incident ray and the normal.^[40] This is known as the Law of Reflection.
For flat mirrors, the law of reflection implies that images of objects are upright and the same distance behind the mirror as the objects are in front of the mirror. The image size is the same as the object size. The law also implies that mirror images are parity inverted, which we perceive as a left-right inversion. Images formed from reflection in two (or any even number of) mirrors are not parity inverted. Corner reflectors^[40] retroreflect light, producing reflected rays that travel back in the direction from which the incident rays came.
Mirrors with curved surfaces can be modelled by ray-tracing and using the law of reflection at each point on the surface. For mirrors with parabolic surfaces, parallel rays incident on the mirror produce reflected rays that converge at a common focus. Other curved surfaces may also focus light, but with aberrations due to the diverging shape causing the focus to be smeared out in space. In particular, spherical mirrors exhibit spherical aberration. Curved mirrors can form images with magnification greater than or less than one, and the magnification can be negative, indicating that the image is inverted. An upright image formed by reflection in a mirror is always virtual, while an inverted image is real and can be projected onto a screen.^[40]

Refractions

Main article: Refraction

Illustration of Snell's Law for the case n₁ < n₂, such as air/water interface

Refraction occurs when light travels through an area of space that has a changing index of refraction; this principle allows for lenses and the focusing of light. The simplest case of refraction occurs when there is an interface between a uniform medium with index of refraction $n_1$ and another medium with index of refraction $n_2$ . In such situations, Snell's Law describes the resulting deflection of the light ray:

$n_1\sin\theta_1 = n_2\sin\theta_2\$
where $\theta_1$ and $\theta_2$ are the angles between the normal (to the interface) and the incident and refracted waves, respectively. This phenomenon is also associated with a changing speed of light as seen from the definition of index of refraction provided above which implies:

$v_1\sin\theta_2\ = v_2\sin\theta_1$
where $v_1$ and $v_2$ are the wave velocities through the respective media.^[40]
Various consequences of Snell's Law include the fact that for light rays travelling from a material with a high index of refraction to a material with a low index of refraction, it is possible for the interaction with the interface to result in zero transmission. This phenomenon is called total internal reflection and allows for fibre optics technology. As light signals travel down a fibre optic cable, it undergoes total internal reflection allowing for essentially no light lost over the length of the cable. It is also possible to produce polarised light rays using a combination of reflection and refraction: When a refracted ray and the reflected ray form a right angle, the reflected ray has the property of "plane polarization". The angle of incidence required for such a scenario is known as Brewster's angle.^[40]
Snell's Law can be used to predict the deflection of light rays as they pass through "linear media" as long as the indexes of refraction and the geometry of the media are known. For example, the propagation of light through a prism results in the light ray being deflected depending on the shape and orientation of the prism. Additionally, since different frequencies of light have slightly different indexes of refraction in most materials, refraction can be used to produce dispersion spectra that appear as rainbows. The discovery of this phenomenon when passing light through a prism is famously attributed to Isaac Newton.^[40]
Some media have an index of refraction which varies gradually with position and, thus, light rays curve through the medium rather than travel in straight lines. This effect is what is responsible for mirages seen on hot days where the changing index of refraction of the air causes the light rays to bend creating the appearance of specular reflections in the distance (as if on the surface of a pool of water). Material that has a varying index of refraction is called a gradient-index (GRIN) material and has many useful properties used in modern optical scanning technologies including photocopiers and scanners. The phenomenon is studied in the field of gradient-index optics.^[41]

A ray tracing diagram for a converging lens.

A device which produces converging or diverging light rays due to refraction is known as a lens. Thin lenses produce focal points on either side that can be modelled using the lensmaker's equation.^[42] In general, two types of lenses exist: convex lenses, which cause parallel light rays to converge, and concave lenses, which cause parallel light rays to diverge. The detailed prediction of how images are produced by these lenses can be made using ray-tracing similar to curved mirrors. Similarly to curved mirrors, thin lenses follow a simple equation that determines the location of the images given a particular focal length ( $f$ ) and object distance ( $S_1$ ):

$\frac{1}{S_1} + \frac{1}{S_2} = \frac{1}{f}$
where $S_2$ is the distance associated with the image and is considered by convention to be negative if on the same side of the lens as the object and positive if on the opposite side of the lens.^[42] The focal length f is considered negative for concave lenses.

Incoming parallel rays are focused by a convex lens into an inverted real image one focal length from the lens, on the far side of the lens. Rays from an object at finite distance are focused further from the lens than the focal distance; the closer the object is to the lens, the further the image is from the lens. With concave lenses, incoming parallel rays diverge after going through the lens, in such a way that they seem to have originated at an upright virtual image one focal length from the lens, on the same side of the lens that the parallel rays are approaching on. Rays from an object at finite distance are associated with a virtual image that is closer to the lens than the focal length, and on the same side of the lens as the object. The closer the object is to the lens, the closer the virtual image is to the lens.
Likewise, the magnification of a lens is given by

$M = - \frac{S_2}{S_1} = \frac{f}{f - S_1}$
where the negative sign is given, by convention, to indicate an upright object for positive values and an inverted object for negative values. Similar to mirrors, upright images produced by single lenses are virtual while inverted images are real.^[40]
Lenses suffer from aberrations that distort images and focal points. These are due to both to geometrical imperfections and due to the changing index of refraction for different wavelengths of light (chromatic aberration).^[40]

Images of black letters in a thin convex lens of focal length f are shown in red. Selected rays are shown for letters E, I and K in blue, green and orange, respectively. Note that E (at 2f) has an equal-size, real and inverted image; I (at f) has its image at infinity; and K (at f/2) has a double-size, virtual and upright image.

Physical optics

Main article: Physical optics
In physical optics, light is considered to propagate as a wave. This model predicts phenomena such as interference and diffraction, which are not explained by geometric optics. The speed of light waves in air is approximately 3.0×10⁸ m/s (exactly 299,792,458 m/s in vacuum). The wavelength of visible light waves varies between 400 and 700 nm, but the term "light" is also often applied to infrared (0.7–300 μm) and ultraviolet radiation (10–400 nm).
The wave model can be used to make predictions about how an optical system will behave without requiring an explanation of what is "waving" in what medium. Until the middle of the 19th century, most physicists believed in an "ethereal" medium in which the light disturbance propagated.^[43] The existence of electromagnetic waves was predicted in 1865 by Maxwell's equations. These waves propagate at the speed of light and have varying electric and magnetic fields which are orthogonal to one another, and also to the direction of propagation of the waves.^[44] Light waves are now generally treated as electromagnetic waves except when quantum mechanical effects have to be considered.

Modelling and design of optical systems using physical optics
Many simplified approximations are available for analysing and designing optical systems. Most of these use a single scalar quantity to represent the electric field of the light wave, rather than using a vector model with orthogonal electric and magnetic vectors.^[45] The Huygens–Fresnel equation is one such model. This was derived empirically by Fresnel in 1815, based on Huygen's hypothesis that each point on a wavefront generates a secondary spherical wavefront, which Fresnel combined with the principle of superposition of waves. The Kirchhoff diffraction equation, which is derived using Maxwell's equations, puts the Huygens-Fresnel equation on a firmer physical foundation. Examples of the application of Huygens–Fresnel principle can be found in the sections on diffraction and Fraunhofer diffraction.
More rigorous models, involving the modelling of both electric and magnetic fields of the light wave, are required when dealing with the detailed interaction of light with materials where the interaction depends on their electric and magnetic properties. For instance, the behaviour of a light wave interacting with a metal surface is quite different from what happens when it interacts with a dielectric material. A vector model must also be used to model polarised light.
Numerical modeling techniques such as the finite element method, the boundary element method and the transmission-line matrix method can be used to model the propagation of light in systems which cannot be solved analytically. Such models are computationally demanding and are normally only used to solve small-scale problems that require accuracy beyond that which can be achieved with analytical solutions.^[46]
All of the results from geometrical optics can be recovered using the techniques of Fourier optics which apply many of the same mathematical and analytical techniques used in acoustic engineering and signal processing.
Gaussian beam propagation is a simple paraxial physical optics model for the propagation of coherent radiation such as laser beams. This technique partially accounts for diffraction, allowing accurate calculations of the rate at which a laser beam expands with distance, and the minimum size to which the beam can be focused. Gaussian beam propagation thus bridges the gap between geometric and physical optics.^[47]

Superposition and interference

Main articles: Superposition principle and Interference (optics)
In the absence of nonlinear effects, the superposition principle can be used to predict the shape of interacting waveforms through the simple addition of the disturbances.^[48] This interaction of waves to produce a resulting pattern is generally termed "interference" and can result in a variety of outcomes. If two waves of the same wavelength and frequency are in phase, both the wave crests and wave troughs align. This results in constructive interference and an increase in the amplitude of the wave, which for light is associated with a brightening of the waveform in that location. Alternatively, if the two waves of the same wavelength and frequency are out of phase, then the wave crests will align with wave troughs and vice-versa. This results in destructive interference and a decrease in the amplitude of the wave, which for light is associated with a dimming of the waveform at that location. See below for an illustration of this effect.^[48]

combined
waveform

wave 1

wave 2

Two waves in phase Two waves 180° out
of phase

When oil or fuel is spilled, colourful patterns are formed by thin-film interference.

Since the Huygens–Fresnel principle states that every point of a wavefront is associated with the production of a new disturbance, it is possible for a wavefront to interfere with itself constructively or destructively at different locations producing bright and dark fringes in regular and predictable patterns.^[48] Interferometry is the science of measuring these patterns, usually as a means of making precise determinations of distances or angular resolutions.^[49] The Michelson interferometer was a famous instrument which used interference effects to accurately measure the speed of light.^[50]
The appearance of thin films and coatings is directly affected by interference effects. Antireflective coatings use destructive interference to reduce the reflectivity of the surfaces they coat, and can be used to minimise glare and unwanted reflections. The simplest case is a single layer with thickness one-fourth the wavelength of incident light. The reflected wave from the top of the film and the reflected wave from the film/material interface are then exactly 180° out of phase, causing destructive interference. The waves are only exactly out of phase for one wavelength, which would typically be chosen to be near the centre of the visible spectrum, around 550 nm. More complex designs using multiple layers can achieve low reflectivity over a broad band, or extremely low reflectivity at a single wavelength.
Constructive interference in thin films can create strong reflection of light in a range of wavelengths, which can be narrow or broad depending on the design of the coating. These films are used to make dielectric mirrors, interference filters, heat reflectors, and filters for colour separation in colour television cameras. This interference effect is also what causes the colourful rainbow patterns seen in oil slicks.^[48]

Diffraction and optical resolution

Main articles: Diffraction and Optical resolution

Diffraction on two slits separated by distance $d$ . The bright fringes occur along lines where black lines intersect with black lines and white lines intersect with white lines. These fringes are separated by angle $\theta$ and are numbered as order $n$ .

Diffraction is the process by which light interference is most commonly observed. The effect was first described in 1665 by Francesco Maria Grimaldi, who also coined the term from the Latin diffringere, 'to break into pieces'.^[51]^[52] Later that century, Robert Hooke and Isaac Newton also described phenomena now known to be diffraction in Newton's rings^[53] while James Gregory recorded his observations of diffraction patterns from bird feathers.^[54]
The first physical optics model of diffraction that relied on the Huygens–Fresnel principle was developed in 1803 by Thomas Young in his interference experiments with the interference patterns of two closely spaced slits. Young showed that his results could only be explained if the two slits acted as two unique sources of waves rather than corpuscles.^[55] In 1815 and 1818, Augustin-Jean Fresnel firmly established the mathematics of how wave interference can account for diffraction.^[42]
The simplest physical models of diffraction use equations that describe the angular separation of light and dark fringes due to light of a particular wavelength (λ). In general, the equation takes the form

$m \lambda = d \sin \theta$
where $d$ is the separation between two wavefront sources (in the case of Young's experiments, it was two slits), $\theta$ is the angular separation between the central fringe and the $m$ th order fringe, where the central maximum is $m = 0$ .^[56]
This equation is modified slightly to take into account a variety of situations such as diffraction through a single gap, diffraction through multiple slits, or diffraction through a diffraction grating that contains a large number of slits at equal spacing.^[56] More complicated models of diffraction require working with the mathematics of Fresnel or Fraunhofer diffraction.^[57]
X-ray diffraction makes use of the fact that atoms in a crystal have regular spacing at distances that are on the order of one angstrom. To see diffraction patterns, x-rays with similar wavelengths to that spacing are passed through the crystal. Since crystals are three-dimensional objects rather than two-dimensional gratings, the associated diffraction pattern varies in two directions according to Bragg reflection, with the associated bright spots occurring in unique patterns and $d$ being twice the spacing between atoms.^[56]
Diffraction effects limit the ability for an optical detector to optically resolve separate light sources. In general, light that is passing through an aperture will experience diffraction and the best images that can be created (as described in diffraction-limited optics) appear as a central spot with surrounding bright rings, separated by dark nulls; this pattern is known as an Airy pattern, and the central bright lobe as an Airy disk.^[42] The size of such a disk is given by

$\sin \theta = 1.22 \frac{\lambda}{D}$
where θ is the angular resolution, λ is the wavelength of the light, and D is the diameter of the lens aperture. If the angular separation of the two points is significantly less than the Airy disk angular radius, then the two points cannot be resolved in the image, but if their angular separation is much greater than this, distinct images of the two points are formed and they can therefore be resolved. Rayleigh defined the somewhat arbitrary "Rayleigh criterion" that two points whose angular separation is equal to the Airy disk radius (measured to first null, that is, to the first place where no light is seen) can be considered to be resolved. It can be seen that the greater the diameter of the lens or its aperture, the finer the resolution.^[56] Interferometry, with its ability to mimic extremely large baseline apertures, allows for the greatest angular resolution possible.^[49]
For astronomical imaging, the atmosphere prevents optimal resolution from being achieved in the visible spectrum due to the atmospheric scattering and dispersion which cause stars to twinkle. Astronomers refer to this effect as the quality of astronomical seeing. Techniques known as adaptive optics have been used to eliminate the atmospheric disruption of images and achieve results that approach the diffraction limit.^[58]

Dispersion and scattering

Main articles: Dispersion (optics) and Scattering

Conceptual animation of light dispersion through a prism. High frequency (blue) light is deflected the most, and low frequency (red) the least.

Refractive processes take place in the physical optics limit, where the wavelength of light is similar to other distances, as a kind of scattering. The simplest type of scattering is Thomson scattering which occurs when electromagnetic waves are deflected by single particles. In the limit of Thompson scattering, in which the wavelike nature of light is evident, light is dispersed independent of the frequency, in contrast to Compton scattering which is frequency-dependent and strictly a quantum mechanical process, involving the nature of light as particles. In a statistical sense, elastic scattering of light by numerous particles much smaller than the wavelength of the light is a process known as Rayleigh scattering while the similar process for scattering by particles that are similar or larger in wavelength is known as Mie scattering with the Tyndall effect being a commonly observed result. A small proportion of light scattering from atoms or molecules may undergo Raman scattering, wherein the frequency changes due to excitation of the atoms and molecules. Brillouin scattering occurs when the frequency of light changes due to local changes with time and movements of a dense material.^[59]
Dispersion occurs when different frequencies of light have different phase velocities, due either to material properties (material dispersion) or to the geometry of an optical waveguide (waveguide dispersion). The most familiar form of dispersion is a decrease in index of refraction with increasing wavelength, which is seen in most transparent materials. This is called "normal dispersion". It occurs in all dielectric materials, in wavelength ranges where the material does not absorb light.^[60] In wavelength ranges where a medium has significant absorption, the index of refraction can increase with wavelength. This is called "anomalous dispersion".^[40]^[60]
The separation of colours by a prism is an example of normal dispersion. At the surfaces of the prism, Snell's law predicts that light incident at an angle θ to the normal will be refracted at an angle arcsin(sin (θ) / n). Thus, blue light, with its higher refractive index, is bent more strongly than red light, resulting in the well-known rainbow pattern.^[40]

Dispersion: two sinusoids propagating at different speeds make a moving interference pattern. The red dot moves with the phase velocity, and the green dots propagate with the group velocity. In this case, the phase velocity is twice the group velocity. The red dot overtakes two green dots, when moving from the left to the right of the figure. In effect, the individual waves (which travel with the phase velocity) escape from the wave packet (which travels with the group velocity).

Material dispersion is often characterised by the Abbe number, which gives a simple measure of dispersion based on the index of refraction at three specific wavelengths. Waveguide dispersion is dependent on the propagation constant.^[42] Both kinds of dispersion cause changes in the group characteristics of the wave, the features of the wave packet that change with the same frequency as the amplitude of the electromagnetic wave. "Group velocity dispersion" manifests as a spreading-out of the signal "envelope" of the radiation and can be quantified with a group dispersion delay parameter:

$D = \frac{1}{v_g^2} \frac{dv_g}{d\lambda}$
where $v_g$ is the group velocity.^[61] For a uniform medium, the group velocity is

$v_g = c \left( n - \lambda \frac{dn}{d\lambda} \right)^{-1}$
where n is the index of refraction and c is the speed of light in a vacuum.^[62] This gives a simpler form for the dispersion delay parameter:

$D = - \frac{\lambda}{c} \, \frac{d^2 n}{d \lambda^2}.$
If D is less than zero, the medium is said to have positive dispersion or normal dispersion. If D is greater than zero, the medium has negative dispersion. If a light pulse is propagated through a normally dispersive medium, the result is the higher frequency components slow down more than the lower frequency components. The pulse therefore becomes positively chirped, or up-chirped, increasing in frequency with time. This causes the spectrum coming out of a prism to appear with red light the least refracted and blue/violet light the most refracted. Conversely, if a pulse travels through an anomalously (negatively) dispersive medium, high frequency components travel faster than the lower ones, and the pulse becomes negatively chirped, or down-chirped, decreasing in frequency with time.^[63]
The result of group velocity dispersion, whether negative or positive, is ultimately temporal spreading of the pulse. This makes dispersion management extremely important in optical communications systems based on optical fibres, since if dispersion is too high, a group of pulses representing information will each spread in time and merge, making it impossible to extract the signal.^[61]

Polarization

Main article: Polarization (waves)
Polarization is a general property of waves that describes the orientation of their oscillations. For transverse waves such as many electromagnetic waves, it describes the orientation of the oscillations in the plane perpendicular to the wave's direction of travel. The oscillations may be oriented in a single direction (linear polarization), or the oscillation direction may rotate as the wave travels (circular or elliptical polarization). Circularly polarised waves can rotate rightward or leftward in the direction of travel, and which of those two rotations is present in a wave is called the wave's chirality.^[64]
The typical way to consider polarization is to keep track of the orientation of the electric field vector as the electromagnetic wave propagates. The electric field vector of a plane wave may be arbitrarily divided into two perpendicular components labeled x and y (with z indicating the direction of travel). The shape traced out in the x-y plane by the electric field vector is a Lissajous figure that describes the polarization state.^[42] The following figures show some examples of the evolution of the electric field vector (blue), with time (the vertical axes), at a particular point in space, along with its x and y components (red/left and green/right), and the path traced by the vector in the plane (purple): The same evolution would occur when looking at the electric field at a particular time while evolving the point in space, along the direction opposite to propagation.

Linear

Circular

Elliptical polarization

In the leftmost figure above, the x and y components of the light wave are in phase. In this case, the ratio of their strengths is constant, so the direction of the electric vector (the vector sum of these two components) is constant. Since the tip of the vector traces out a single line in the plane, this special case is called linear polarization. The direction of this line depends on the relative amplitudes of the two components.^[64]
In the middle figure, the two orthogonal components have the same amplitudes and are 90° out of phase. In this case, one component is zero when the other component is at maximum or minimum amplitude. There are two possible phase relationships that satisfy this requirement: the x component can be 90° ahead of the y component or it can be 90° behind the y component. In this special case, the electric vector traces out a circle in the plane, so this polarization is called circular polarization. The rotation direction in the circle depends on which of the two phase relationships exists and corresponds to right-hand circular polarization and left-hand circular polarization.^[42]
In all other cases, where the two components either do not have the same amplitudes and/or their phase difference is neither zero nor a multiple of 90°, the polarization is called elliptical polarization because the electric vector traces out an ellipse in the plane (the polarization ellipse). This is shown in the above figure on the right. Detailed mathematics of polarization is done using Jones calculus and is characterised by the Stokes parameters.^[42]

Changing polarization
Media that have different indexes of refraction for different polarization modes are called birefringent.^[64] Well known manifestations of this effect appear in optical wave plates/retarders (linear modes) and in Faraday rotation/optical rotation (circular modes).^[42] If the path length in the birefringent medium is sufficient, plane waves will exit the material with a significantly different propagation direction, due to refraction. For example, this is the case with macroscopic crystals of calcite, which present the viewer with two offset, orthogonally polarised images of whatever is viewed through them. It was this effect that provided the first discovery of polarization, by Erasmus Bartholinus in 1669. In addition, the phase shift, and thus the change in polarization state, is usually frequency dependent, which, in combination with dichroism, often gives rise to bright colours and rainbow-like effects. In mineralogy, such properties, known as pleochroism, are frequently exploited for the purpose of identifying minerals using polarization microscopes. Additionally, many plastics that are not normally birefringent will become so when subject to mechanical stress, a phenomenon which is the basis of photoelasticity.^[64] Non-birefringent methods, to rotate the linear polarization of light beams, include the use of prismatic polarization rotators which use total internal reflection in a prism set designed for efficient collinear transmission.^[65]

A polariser changing the orientation of linearly polarised light.
In this picture, θ₁ – θ₀ = θ_i.

Media that reduce the amplitude of certain polarization modes are called dichroic. with devices that block nearly all of the radiation in one mode known as polarizing filters or simply "polarisers". Malus' law, which is named after Étienne-Louis Malus, says that when a perfect polariser is placed in a linear polarised beam of light, the intensity, I, of the light that passes through is given by

$I = I_0 \cos^2 \theta_i \quad ,$
where

I₀ is the initial intensity,
and θ_i is the angle between the light's initial polarization direction and the axis of the polariser.^[64]
A beam of unpolarised light can be thought of as containing a uniform mixture of linear polarizations at all possible angles. Since the average value of $\cos^2 \theta$ is 1/2, the transmission coefficient becomes

$\frac {I}{I_0} = \frac {1}{2}\quad$
In practice, some light is lost in the polariser and the actual transmission of unpolarised light will be somewhat lower than this, around 38% for Polaroid-type polarisers but considerably higher (>49.9%) for some birefringent prism types.^[42]
In addition to birefringence and dichroism in extended media, polarization effects can also occur at the (reflective) interface between two materials of different refractive index. These effects are treated by the Fresnel equations. Part of the wave is transmitted and part is reflected, with the ratio depending on angle of incidence and the angle of refraction. In this way, physical optics recovers Brewster's angle.^[42] When light reflects from a thin film on a surface, interference between the reflections from the film's surfaces can produce polarization in the reflected and transmitted light.

Natural light

The effects of a polarising filter on the sky in a photograph. Left picture is taken without polariser. For the right picture, filter was adjusted to eliminate certain polarizations of the scattered blue light from the sky.

Most sources of electromagnetic radiation contain a large number of atoms or molecules that emit light. The orientation of the electric fields produced by these emitters may not be correlated, in which case the light is said to be unpolarised. If there is partial correlation between the emitters, the light is partially polarised. If the polarization is consistent across the spectrum of the source, partially polarised light can be described as a superposition of a completely unpolarised component, and a completely polarised one. One may then describe the light in terms of the degree of polarization, and the parameters of the polarization ellipse.^[42]
Light reflected by shiny transparent materials is partly or fully polarised, except when the light is normal (perpendicular) to the surface. It was this effect that allowed the mathematician Étienne-Louis Malus to make the measurements that allowed for his development of the first mathematical models for polarised light. Polarization occurs when light is scattered in the atmosphere. The scattered light produces the brightness and colour in clear skies. This partial polarization of scattered light can be taken advantage of using polarizing filters to darken the sky in photographs. Optical polarization is principally of importance in chemistry due to circular dichroism and optical rotation ("circular birefringence") exhibited by optically active (chiral) molecules.^[42]

Modern optics

Main articles: Optical physics and Optical engineering
Modern optics encompasses the areas of optical science and engineering that became popular in the 20th century. These areas of optical science typically relate to the electromagnetic or quantum properties of light but do include other topics. A major subfield of modern optics, quantum optics, deals with specifically quantum mechanical properties of light. Quantum optics is not just theoretical; some modern devices, such as lasers, have principles of operation that depend on quantum mechanics. Light detectors, such as photomultipliers and channeltrons, respond to individual photons. Electronic image sensors, such as CCDs, exhibit shot noise corresponding to the statistics of individual photon events. Light-emitting diodes and photovoltaic cells, too, cannot be understood without quantum mechanics. In the study of these devices, quantum optics often overlaps with quantum electronics.^[66]
Specialty areas of optics research include the study of how light interacts with specific materials as in crystal optics and metamaterials. Other research focuses on the phenomenology of electromagnetic waves as in singular optics, non-imaging optics, non-linear optics, statistical optics, and radiometry. Additionally, computer engineers have taken an interest in integrated optics, machine vision, and photonic computing as possible components of the "next generation" of computers.^[67]
Today, the pure science of optics is called optical science or optical physics to distinguish it from applied optical sciences, which are referred to as optical engineering. Prominent subfields of optical engineering include illumination engineering, photonics, and optoelectronics with practical applications like lens design, fabrication and testing of optical components, and image processing. Some of these fields overlap, with nebulous boundaries between the subjects terms that mean slightly different things in different parts of the world and in different areas of industry. A professional community of researchers in nonlinear optics has developed in the last several decades due to advances in laser technology.^[68]

Lasers

Experiments such as this one with high-power lasers are part of the modern optics research.

Main article: Laser
A laser is a device that emits light (electromagnetic radiation) through a process called stimulated emission. The term laser is an acronym for Light Amplification by Stimulated Emission of Radiation.^[69] Laser light is usually spatially coherent, which means that the light either is emitted in a narrow, low-divergence beam, or can be converted into one with the help of optical components such as lenses. Because the microwave equivalent of the laser, the maser, was developed first, devices that emit microwave and radio frequencies are usually called masers.^[70]
The first working laser was demonstrated on 16 May 1960 by Theodore Maiman at Hughes Research Laboratories.^[71] When first invented, they were called "a solution looking for a problem".^[72] Since then, lasers have become a multi-billion dollar industry, finding utility in thousands of highly varied applications. The first application of lasers visible in the daily lives of the general population was the supermarket barcode scanner, introduced in 1974.^[73] The laserdisc player, introduced in 1978, was the first successful consumer product to include a laser, but the compact disc player was the first laser-equipped device to become truly common in consumers' homes, beginning in 1982.^[74] These optical storage devices use a semiconductor laser less than a millimetre wide to scan the surface of the disc for data retrieval. Fibre-optic communication relies on lasers to transmit large amounts of information at the speed of light. Other common applications of lasers include laser printers and laser pointers. Lasers are used in medicine in areas such as bloodless surgery, laser eye surgery, and laser capture microdissection and in military applications such as missile defence systems, electro-optical countermeasures (EOCM), and LIDAR. Lasers are also used in holograms, bubblegrams, laser light shows, and laser hair removal.^[75]

Kapitsa–Dirac effect
The Kapitsa–Dirac effect causes beams of particles to diffract as the result of meeting a standing wave of light. Light can be used to position matter using various phenomena (see optical tweezers).

Applications
Optics is part of everyday life. The ubiquity of visual systems in biology indicates the central role optics plays as the science of one of the five senses. Many people benefit from eyeglasses or contact lenses, and optics are integral to the functioning of many consumer goods including cameras. Rainbows and mirages are examples of optical phenomena. Optical communication provides the backbone for both the Internet and modern telephony.

Human eye

Model of a human eye. Features mentioned in this article are 3. ciliary muscle, 6. pupil, 8. cornea, 10. lens cortex, 22. optic nerve, 26. fovea, 30. retina

Main articles: Human eye and Photometry (optics)
The human eye functions by focusing light onto a layer of photoreceptor cells called the retina, which forms the inner lining of the back of the eye. The focusing is accomplished by a series of transparent media. Light entering the eye passes first through the cornea, which provides much of the eye's optical power. The light then continues through the fluid just behind the cornea—the anterior chamber, then passes through the pupil. The light then passes through the lens, which focuses the light further and allows adjustment of focus. The light then passes through the main body of fluid in the eye—the vitreous humour, and reaches the retina. The cells in the retina line the back of the eye, except for where the optic nerve exits; this results in a blind spot.
There are two types of photoreceptor cells, rods and cones, which are sensitive to different aspects of light.^[76] Rod cells are sensitive to the intensity of light over a wide frequency range, thus are responsible for black-and-white vision. Rod cells are not present on the fovea, the area of the retina responsible for central vision, and are not as responsive as cone cells to spatial and temporal changes in light. There are, however, twenty times more rod cells than cone cells in the retina because the rod cells are present across a wider area. Because of their wider distribution, rods are responsible for peripheral vision.^[77]
In contrast, cone cells are less sensitive to the overall intensity of light, but come in three varieties that are sensitive to different frequency-ranges and thus are used in the perception of colour and photopic vision. Cone cells are highly concentrated in the fovea and have a high visual acuity meaning that they are better at spatial resolution than rod cells. Since cone cells are not as sensitive to dim light as rod cells, most night vision is limited to rod cells. Likewise, since cone cells are in the fovea, central vision (including the vision needed to do most reading, fine detail work such as sewing, or careful examination of objects) is done by cone cells.^[77]
Ciliary muscles around the lens allow the eye's focus to be adjusted. This process is known as accommodation. The near point and far point define the nearest and farthest distances from the eye at which an object can be brought into sharp focus. For a person with normal vision, the far point is located at infinity. The near point's location depends on how much the muscles can increase the curvature of the lens, and how inflexible the lens has become with age. Optometrists, ophthalmologists, and opticians usually consider an appropriate near point to be closer than normal reading distance—approximately 25 cm.^[76]
Defects in vision can be explained using optical principles. As people age, the lens becomes less flexible and the near point recedes from the eye, a condition known as presbyopia. Similarly, people suffering from hyperopia cannot decrease the focal length of their lens enough to allow for nearby objects to be imaged on their retina. Conversely, people who cannot increase the focal length of their lens enough to allow for distant objects to be imaged on the retina suffer from myopia and have a far point that is considerably closer than infinity. A condition known as astigmatism results when the cornea is not spherical but instead is more curved in one direction. This causes horizontally extended objects to be focused on different parts of the retina than vertically extended objects, and results in distorted images.^[76]
All of these conditions can be corrected using corrective lenses. For presbyopia and hyperopia, a converging lens provides the extra curvature necessary to bring the near point closer to the eye while for myopia a diverging lens provides the curvature necessary to send the far point to infinity. Astigmatism is corrected with a cylindrical surface lens that curves more strongly in one direction than in another, compensating for the non-uniformity of the cornea.^[78]
The optical power of corrective lenses is measured in diopters, a value equal to the reciprocal of the focal length measured in meters; with a positive focal length corresponding to a converging lens and a negative focal length corresponding to a diverging lens. For lenses that correct for astigmatism as well, three numbers are given: one for the spherical power, one for the cylindrical power, and one for the angle of orientation of the astigmatism.^[78]

Visual effects

Main articles: Optical illusions and Perspective (graphical)

For the visual effects used in film, video, and computer graphics, see visual effects.

The Ponzo Illusion relies on the fact that parallel lines appear to converge as they approach infinity.

Optical illusions (also called visual illusions) are characterized by visually perceived images that differ from objective reality. The information gathered by the eye is processed in the brain to give a percept that differs from the object being imaged. Optical illusions can be the result of a variety of phenomena including physical effects that create images that are different from the objects that make them, the physiological effects on the eyes and brain of excessive stimulation (e.g. brightness, tilt, colour, movement), and cognitive illusions where the eye and brain make unconscious inferences.^[79]
Cognitive illusions include some which result from the unconscious misapplication of certain optical principles. For example, the Ames room, Hering, Müller-Lyer, Orbison, Ponzo, Sander, and Wundt illusions all rely on the suggestion of the appearance of distance by using converging and diverging lines, in the same way that parallel light rays (or indeed any set of parallel lines) appear to converge at a vanishing point at infinity in two-dimensionally rendered images with artistic perspective.^[80] This suggestion is also responsible for the famous moon illusion where the moon, despite having essentially the same angular size, appears much larger near the horizon than it does at zenith.^[81] This illusion so confounded Ptolemy that he incorrectly attributed it to atmospheric refraction when he described it in his treatise, Optics.^[8]
Another type of optical illusion exploits broken patterns to trick the mind into perceiving symmetries or asymmetries that are not present. Examples include the café wall, Ehrenstein, Fraser spiral, Poggendorff, and Zöllner illusions. Related, but not strictly illusions, are patterns that occur due to the superimposition of periodic structures. For example transparent tissues with a grid structure produce shapes known as moiré patterns, while the superimposition of periodic transparent patterns comprising parallel opaque lines or curves produces line moiré patterns.^[82]

Optical instruments

Illustrations of various optical instruments from the 1728 Cyclopaedia

Main article: Optical instruments
Single lenses have a variety of applications including photographic lenses, corrective lenses, and magnifying glasses while single mirrors are used in parabolic reflectors and rear-view mirrors. Combining a number of mirrors, prisms, and lenses produces compound optical instruments which have practical uses. For example, a periscope is simply two plane mirrors aligned to allow for viewing around obstructions. The most famous compound optical instruments in science are the microscope and the telescope which were both invented by the Dutch in the late 16th century.^[83]
Microscopes were first developed with just two lenses: an objective lens and an eyepiece. The objective lens is essentially a magnifying glass and was designed with a very small focal length while the eyepiece generally has a longer focal length. This has the effect of producing magnified images of close objects. Generally, an additional source of illumination is used since magnified images are dimmer due to the conservation of energy and the spreading of light rays over a larger surface area. Modern microscopes, known as compound microscopes have many lenses in them (typically four) to optimize the functionality and enhance image stability.^[83] A slightly different variety of microscope, the comparison microscope, looks at side-by-side images to produce a stereoscopic binocular view that appears three dimensional when used by humans.^[84]
The first telescopes, called refracting telescopes were also developed with a single objective and eyepiece lens. In contrast to the microscope, the objective lens of the telescope was designed with a large focal length to avoid optical aberrations. The objective focuses an image of a distant object at its focal point which is adjusted to be at the focal point of an eyepiece of a much smaller focal length. The main goal of a telescope is not necessarily magnification, but rather collection of light which is determined by the physical size of the objective lens. Thus, telescopes are normally indicated by the diameters of their objectives rather than by the magnification which can be changed by switching eyepieces. Because the magnification of a telescope is equal to the focal length of the objective divided by the focal length of the eyepiece, smaller focal-length eyepieces cause greater magnification.^[83]
Since crafting large lenses is much more difficult than crafting large mirrors, most modern telescopes are reflecting telescopes, that is, telescopes that use a primary mirror rather than an objective lens. The same general optical considerations apply to reflecting telescopes that applied to refracting telescopes, namely, the larger the primary mirror, the more light collected, and the magnification is still equal to the focal length of the primary mirror divided by the focal length of the eyepiece. Professional telescopes generally do not have eyepieces and instead place an instrument (often a charge-coupled device) at the focal point instead.^[83]

Photography

Main article: Science of photography

Photograph taken with aperture f/32

Photograph taken with aperture f/5

The optics of photography involves both lenses and the medium in which the electromagnetic radiation is recorded, whether it be a plate, film, or charge-coupled device. Photographers must consider the reciprocity of the camera and the shot which is summarized by the relation

Exposure ∝ ApertureArea × ExposureTime × SceneLuminance^[85]
In other words, the smaller the aperture (giving greater depth of focus), the less light coming in, so the length of time has to be increased (leading to possible blurriness if motion occurs). An example of the use of the law of reciprocity is the Sunny 16 rule which gives a rough estimate for the settings needed to estimate the proper exposure in daylight.^[86]
A camera's aperture is measured by a unitless number called the f-number or f-stop, f/#, often notated as $N$ , and given by

$f/\# = N = \frac fD \$
where $f$ is the focal length, and $D$ is the diameter of the entrance pupil. By convention, "f/#" is treated as a single symbol, and specific values of f/# are written by replacing the number sign with the value. The two ways to increase the f-stop are to either decrease the diameter of the entrance pupil or change to a longer focal length (in the case of a zoom lens, this can be done by simply adjusting the lens). Higher f-numbers also have a larger depth of field due to the lens approaching the limit of a pinhole camera which is able to focus all images perfectly, regardless of distance, but requires very long exposure times.^[87]
The field of view that the lens will provide changes with the focal length of the lens. There are three basic classifications based on the relationship to the diagonal size of the film or sensor size of the camera to the focal length of the lens:^[88]

Normal lens: angle of view of about 50° (called normal because this angle considered roughly equivalent to human vision^[88]) and a focal length approximately equal to the diagonal of the film or sensor.^[89]

Wide-angle lens: angle of view wider than 60° and focal length shorter than a normal lens.^[90]

Long focus lens: angle of view narrower than a normal lens. This is any lens with a focal length longer than the diagonal measure of the film or sensor.^[91] The most common type of long focus lens is the telephoto lens, a design that uses a special telephoto group to be physically shorter than its focal length.^[92]

Modern zoom lenses may have some or all of these attributes.
The absolute value for the exposure time required depends on how sensitive to light the medium being used is (measured by the film speed, or, for digital media, by the quantum efficiency).^[93] Early photography used media that had very low light sensitivity, and so exposure times had to be long even for very bright shots. As technology has improved, so has the sensitivity through film cameras and digital cameras.^[94]
Other results from physical and geometrical optics apply to camera optics. For example, the maximum resolution capability of a particular camera set-up is determined by the diffraction limit associated with the pupil size and given, roughly, by the Rayleigh criterion.^[95]

Atmospheric optics

Main article: Atmospheric optics

A colourful sky is often due to scattering of light off particulates and pollution, as in this photograph of a sunset during the October 2007 California wildfires.

The unique optical properties of the atmosphere cause a wide range of spectacular optical phenomena. The blue colour of the sky is a direct result of Rayleigh scattering which redirects higher frequency (blue) sunlight back into the field of view of the observer. Because blue light is scattered more easily than red light, the sun takes on a reddish hue when it is observed through a thick atmosphere, as during a sunrise or sunset. Additional particulate matter in the sky can scatter different colours at different angles creating colourful glowing skies at dusk and dawn. Scattering off of ice crystals and other particles in the atmosphere are responsible for halos, afterglows, coronas, rays of sunlight, and sun dogs. The variation in these kinds of phenomena is due to different particle sizes and geometries.^[96]
Mirages are optical phenomena in which light rays are bent due to thermal variations in the refraction index of air, producing displaced or heavily distorted images of distant objects. Other dramatic optical phenomena associated with this include the Novaya Zemlya effect where the sun appears to rise earlier than predicted with a distorted shape. A spectacular form of refraction occurs with a temperature inversion called the Fata Morgana where objects on the horizon or even beyond the horizon, such as islands, cliffs, ships or icebergs, appear elongated and elevated, like "fairy tale castles".^[97]
Rainbows are the result of a combination of internal reflection and dispersive refraction of light in raindrops. A single reflection off the backs of an array of raindrops produces a rainbow with an angular size on the sky that ranges from 40° to 42° with red on the outside. Double rainbows are produced by two internal reflections with angular size of 50.5° to 54° with violet on the outside. Because rainbows are seen with the sun 180° away from the centre of the rainbow, rainbows are more prominent the closer the sun is to the horizon.^[64]

Please comment on a proposed amendment regarding undisclosed paid editing.

Special relativity

From Wikipedia, the free encyclopedia

  (Redirected from Special Relativity)

For history and motivation, see History of special relativity.

For a generally accessible and less technical introduction to the topic, see Introduction to special relativity.

Part of a series on

Special relativity

Principle of relativity

Introduction to special relativity

Theory of relativity

Special relativity (alternative formulations)

Foundations and formulation[show]

Consequences[show]

Spacetime [show]

Dynamics [show]

History and precursors[show]

People[show]

v

t

e

In physics, special relativity (SR, also known as the special theory of relativity or STR) is the accepted physical theory regarding the relationship between space and time. It is based on two postulates: (1) that the laws of physics are invariant (i.e., identical) in all inertial systems (non-accelerating frames of reference); and (2) that the speed of light in a vacuum is the same for all observers, regardless of the motion of the light source. It was originally proposed in 1905 by Albert Einstein in the paper "On the Electrodynamics of Moving Bodies".^[1] The inconsistency of classical mechanics with Maxwell’s equations of electromagnetism led to the development of special relativity, which corrects classical mechanics to handle situations involving motions nearing the speed of light. As of today, special relativity is the most accurate model of motion at any speed. Even so, classical mechanics is still useful (due to its simplicity and high accuracy) as an approximation at small velocities relative to the speed of light.
Special relativity implies a wide range of consequences, which have been experimentally verified,^[2] including length contraction, time dilation, relativistic mass, mass–energy equivalence, a universal speed limit, and relativity of simultaneity. It has replaced the conventional notion of an absolute universal time with the notion of a time that is dependent on reference frame and spatial position. Rather than an invariant time interval between two events, there is an invariant spacetime interval. Combined with other laws of physics, the two postulates of special relativity predict the equivalence of mass and energy, as expressed in the mass–energy equivalence formula E = mc², where c is the speed of light in vacuum.^[3]^[4]
A defining feature of special relativity is the replacement of the Galilean transformations of classical mechanics with the Lorentz transformations. Time and space cannot be defined separately from one another. Rather space and time are interwoven into a single continuum known as spacetime. Events that occur at the same time for one observer could occur at different times for another.
The theory is called "special" because it applied the principle of relativity only to the special case of inertial reference frames. Einstein later published a paper on general relativity in 1915 to apply the principle in the general case, that is, to any frame so as to handle general coordinate transformations, and gravitational effects.
As Galilean relativity is now considered an approximation of special relativity valid for low speeds, special relativity is considered an approximation of the theory of general relativity valid for weak gravitational fields. The presence of gravity becomes undetectable at sufficiently small-scale, free-falling conditions. General relativity incorporates noneuclidean geometry, so that the gravitational effects are represented by the geometric curvature of spacetime. Contrarily, special relativity is restricted to flat spacetime. The geometry of spacetime in special relativity is called Minkowski space. A locally Lorentz invariant frame that abides by Special relativity can be defined at sufficiently small scales, even in curved spacetime.
Galileo Galilei had already postulated that there is no absolute and well-defined state of rest (no privileged reference frames), a principle now called Galileo's principle of relativity. Einstein extended this principle so that it accounted for the constant speed of light,^[5] a phenomenon that had been recently observed in the Michelson–Morley experiment. He also postulated that it holds for all the laws of physics, including both the laws of mechanics and of electrodynamics.^[6]

Albert Einstein around 1905

Contents

1 Postulates

2 Lack of an absolute reference frame

3 Reference frames, coordinates and the Lorentz transformation

4 Consequences derived from the Lorentz transformation

4.1 Relativity of simultaneity

4.2 Time dilation

4.3 Length contraction

4.4 Composition of velocities

5 Other consequences

5.1 Thomas rotation

5.2 Equivalence of mass and energy

5.3 How far can one travel from the Earth?

6 Causality and prohibition of motion faster than light

7 Geometry of spacetime

7.1 Comparison between flat Euclidean space and Minkowski space

7.2 3D spacetime

7.3 4D spacetime

8 Physics in spacetime

8.1 Transformations of physical quantities between reference frames

8.2 Metric

8.3 Relativistic kinematics and invariance

8.4 Relativistic dynamics and invariance

9 Relativity and unifying electromagnetism

10 Status

11 Theories of relativity and quantum mechanics

12 See also

13 References

13.1 Textbooks

13.2 Journal articles

14 External links

14.1 Original works

14.2 Special relativity for a general audience (no mathematical knowledge required)

14.3 Special relativity explained (using simple or more advanced mathematics)

14.4 Visualization

Postulates

“ Reflections of this type made it clear to me as long ago as shortly after 1900, i.e., shortly after Planck's trailblazing work, that neither mechanics nor electrodynamics could (except in limiting cases) claim exact validity. Gradually I despaired of the possibility of discovering the true laws by means of constructive efforts based on known facts. The longer and the more desperately I tried, the more I came to the conviction that only the discovery of a universal formal principle could lead us to assured results... How, then, could such a universal principle be found? ”

—Albert Einstein: Autobiographical Notes^[7]

Einstein discerned two fundamental propositions that seemed to be the most assured, regardless of the exact validity of the (then) known laws of either mechanics or electrodynamics. These propositions were the constancy of the speed of light and the independence of physical laws (especially the constancy of the speed of light) from the choice of inertial system. In his initial presentation of special relativity in 1905 he expressed these postulates as:^[1]

The Principle of Relativity – The laws by which the states of physical systems undergo change are not affected, whether these changes of state be referred to the one or the other of two systems in uniform translatory motion relative to each other.^[1]

The Principle of Invariant Light Speed – "... light is always propagated in empty space with a definite velocity [speed] c which is independent of the state of motion of the emitting body." (from the preface).^[1] That is, light in vacuum propagates with the speed c (a fixed constant, independent of direction) in at least one system of inertial coordinates (the "stationary system"), regardless of the state of motion of the light source.

The derivation of special relativity depends not only on these two explicit postulates, but also on several tacit assumptions (made in almost all theories of physics), including the isotropy and homogeneity of space and the independence of measuring rods and clocks from their past history.^[8]
Following Einstein's original presentation of special relativity in 1905, many different sets of postulates have been proposed in various alternative derivations.^[9] However, the most common set of postulates remains those employed by Einstein in his original paper. A more mathematical statement of the Principle of Relativity made later by Einstein, which introduces the concept of simplicity not mentioned above is:

Special principle of relativity: If a system of coordinates K is chosen so that, in relation to it, physical laws hold good in their simplest form, the same laws hold good in relation to any other system of coordinates K' moving in uniform translation relatively to K.^[10]
Henri Poincaré provided the mathematical framework for relativity theory by proving that Lorentz transformations are a subset of his Poincaré group of symmetry transformations. Einstein later derived these transformations from his axioms.
Many of Einstein's papers present derivations of the Lorentz transformation based upon these two principles.^[11]
Einstein consistently based the derivation of Lorentz invariance (the essential core of special relativity) on just the two basic principles of relativity and light-speed invariance. He wrote:

The insight fundamental for the special theory of relativity is this: The assumptions relativity and light speed invariance are compatible if relations of a new type ("Lorentz transformation") are postulated for the conversion of coordinates and times of events... The universal principle of the special theory of relativity is contained in the postulate: The laws of physics are invariant with respect to Lorentz transformations (for the transition from one inertial system to any other arbitrarily chosen inertial system). This is a restricting principle for natural laws...^[7]
Thus many modern treatments of special relativity base it on the single postulate of universal Lorentz covariance, or, equivalently, on the single postulate of Minkowski spacetime.^[12]^[13]
From the principle of relativity alone without assuming the constancy of the speed of light (i.e. using the isotropy of space and the symmetry implied by the principle of special relativity) one can show that the spacetime transformations between inertial frames are either Euclidean, Galilean, or Lorentzian. In the Lorentzian case, one can then obtain relativistic interval conservation and a certain finite limiting speed. Experiments suggest that this speed is the speed of light in vacuum.^[14]^[15]
The constancy of the speed of light was motivated by Maxwell's theory of electromagnetism and the lack of evidence for the luminiferous ether. There is conflicting evidence on the extent to which Einstein was influenced by the null result of the Michelson–Morley experiment.^[16]^[17] In any case, the null result of the Michelson–Morley experiment helped the notion of the constancy of the speed of light gain widespread and rapid acceptance.

Lack of an absolute reference frame
The principle of relativity, which states that there is no preferred inertial reference frame, dates back to Galileo, and was incorporated into Newtonian physics. However, in the late 19th century, the existence of electromagnetic waves led physicists to suggest that the universe was filled with a substance that they called "aether", which would act as the medium through which these waves, or vibrations travelled. The aether was thought to constitute an absolute reference frame against which speeds could be measured, and could be considered fixed and motionless. Aether supposedly possessed some wonderful properties: it was sufficiently elastic to support electromagnetic waves, and those waves could interact with matter, yet it offered no resistance to bodies passing through it. The results of various experiments, including the Michelson–Morley experiment, indicated that the Earth was always 'stationary' relative to the aether – something that was difficult to explain, since the Earth is in orbit around the Sun. Einstein's solution was to discard the notion of an aether and the absolute state of rest. In relativity, any reference frame moving with uniform motion will observe the same laws of physics. In particular, the speed of light in vacuum is always measured to be c, even when measured by multiple systems that are moving at different (but constant) velocities.

Reference frames, coordinates and the Lorentz transformation

Main article: Lorentz transformation

The primed system is in motion relative to the unprimed system with constant speed v only along the x-axis, from the perspective of an observer stationary in the unprimed system. By the principle of relativity, an observer stationary in the primed system will view a likewise construction except that the speed they record will be −v. The changing of the speed of propagation of interaction from infinite in non-relativistic mechanics to a finite value will require a modification of the transformation equations mapping events in one frame to another.

Relativity theory depends on "reference frames". The term reference frame as used here is an observational perspective in space which is not undergoing any change in motion (acceleration), from which a position can be measured along 3 spatial axes. In addition, a reference frame has the ability to determine measurements of the time of events using a 'clock' (any reference device with uniform periodicity).
An event is an occurrence that can be assigned a single unique time and location in space relative to a reference frame: it is a "point" in spacetime. Since the speed of light is constant in relativity in each and every reference frame, pulses of light can be used to unambiguously measure distances and refer back the times that events occurred to the clock, even though light takes time to reach the clock after the event has transpired.
For example, the explosion of a firecracker may be considered to be an "event". We can completely specify an event by its four spacetime coordinates: The time of occurrence and its 3-dimensional spatial location define a reference point. Let's call this reference frame S.
In relativity theory we often want to calculate the position of a point from a different reference point.
Suppose we have a second reference frame S′, whose spatial axes and clock exactly coincide with that of S at time zero, but it is moving at a constant velocity v with respect to S along the x-axis.
Since there is no absolute reference frame in relativity theory, a concept of 'moving' doesn't strictly exist, as everything is always moving with respect to some other reference frame. Instead, any two frames that move at the same speed in the same direction are said to be comoving. Therefore S and S′ are not comoving.
Define the event to have spacetime coordinates (t,x,y,z) in system S and (t′,x′,y′,z′) in S′. Then the Lorentz transformation specifies that these coordinates are related in the following way:

$\begin{align} t' &= \gamma \ (t - vx/c^2) \\ x' &= \gamma \ (x - v t) \\ y' &= y \\ z' &= z , \end{align}$
where

$\gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}$
is the Lorentz factor and c is the speed of light in vacuum, and the velocity v of S′ is parallel to the x-axis. The y and z coordinates are unaffected; only the x and t coordinates are transformed. These Lorentz transformations form a one-parameter group of linear mappings, that parameter being called rapidity.
There is nothing special about the x-axis, the transformation can apply to the y or z axes, or indeed in any direction, which can be done by directions parallel to the motion (which are warped by the γ factor) and perpendicular; see main article for details.
A quantity invariant under Lorentz transformations is known as a Lorentz scalar.
Writing the Lorentz transformation and its inverse in terms of coordinate differences, where for instance one event has coordinates (x₁, t₁) and (x′₁, t′₁), another event has coordinates (x₂, t₂) and (x′₂, t′₂), and the differences are defined as

$\begin{array}{ll} \Delta x' = x'_2-x'_1 \ , & \Delta x = x_2-x_1 \ , \\ \Delta t' = t'_2-t'_1 \ , & \Delta t = t_2-t_1 \ , \\ \end{array}$
we get

$\begin{array}{ll} \Delta x' = \gamma \ (\Delta x - v \,\Delta t) \ , & \Delta x = \gamma \ (\Delta x' + v \,\Delta t') \ , \\ \Delta t' = \gamma \ \left(\Delta t - \dfrac{v \,\Delta x}{c^{2}} \right) \ , & \Delta t = \gamma \ \left(\Delta t' + \dfrac{v \,\Delta x'}{c^{2}} \right) \ . \\ \end{array}$
These effects are not merely appearances; they are explicitly related to our way of measuring time intervals between events which occur at the same place in a given coordinate system (called "co-local" events). These time intervals will be different in another coordinate system moving with respect to the first, unless the events are also simultaneous. Similarly, these effects also relate to our measured distances between separated but simultaneous events in a given coordinate system of choice. If these events are not co-local, but are separated by distance (space), they will not occur at the same spatial distance from each other when seen from another moving coordinate system. However, the spacetime interval will be the same for all observers. The underlying reality remains the same. Only our perspective changes.

Consequences derived from the Lorentz transformation

See also: Twin paradox and Relativistic mechanics
The consequences of special relativity can be derived from the Lorentz transformation equations.^[18] These transformations, and hence special relativity, lead to different physical predictions than those of Newtonian mechanics when relative velocities become comparable to the speed of light. The speed of light is so much larger than anything humans encounter that some of the effects predicted by relativity are initially counterintuitive.

Relativity of simultaneity

See also: Relativity of simultaneity

Event B is simultaneous with A in the green reference frame, but it occurred before in the blue frame, and will occur later in the red frame.

Two events happening in two different locations that occur simultaneously in the reference frame of one inertial observer, may occur non-simultaneously in the reference frame of another inertial observer (lack of absolute simultaneity).
From the first equation of the Lorentz transformation in terms of coordinate differences

$\Delta t' = \gamma \left(\Delta t - \frac{v \,\Delta x}{c^{2}} \right)$
it is clear that two events that are simultaneous in frame S (satisfying Δt = 0), are not necessarily simultaneous in another inertial frame S′ (satisfying Δt′ = 0). Only if these events are additionally co-local in frame S (satisfying Δx = 0), will they be simultaneous in another frame S′.

Time dilation

See also: Time dilation
The time lapse between two events is not invariant from one observer to another, but is dependent on the relative speeds of the observers' reference frames (e.g., the twin paradox which concerns a twin who flies off in a spaceship traveling near the speed of light and returns to discover that his or her twin sibling has aged much more).
Suppose a clock is at rest in the unprimed system S. Two different ticks of this clock are then characterized by Δx = 0. To find the relation between the times between these ticks as measured in both systems, the first equation can be used to find:

$\Delta t' = \gamma\, \Delta t$     for events satisfying     $\Delta x = 0 \ .$
This shows that the time (Δt') between the two ticks as seen in the frame in which the clock is moving (S′), is longer than the time (Δt) between these ticks as measured in the rest frame of the clock (S). Time dilation explains a number of physical phenomena; for example, the decay rate of muons produced by cosmic rays impinging on the Earth's atmosphere.^[19]

Length contraction

See also: Lorentz contraction
The dimensions (e.g., length) of an object as measured by one observer may be smaller than the results of measurements of the same object made by another observer (e.g., the ladder paradox involves a long ladder traveling near the speed of light and being contained within a smaller garage).
Similarly, suppose a measuring rod is at rest and aligned along the x-axis in the unprimed system S. In this system, the length of this rod is written as Δx. To measure the length of this rod in the system S′, in which the clock is moving, the distances x′ to the end points of the rod must be measured simultaneously in that system S′. In other words, the measurement is characterized by Δt′ = 0, which can be combined with the fourth equation to find the relation between the lengths Δx and Δx′:

$\Delta x' = \frac{\Delta x}{\gamma}$     for events satisfying     $\Delta t' = 0 \ .$
This shows that the length (Δx′) of the rod as measured in the frame in which it is moving (S′), is shorter than its length (Δx) in its own rest frame (S).

Composition of velocities

See also: Velocity-addition formula
Velocities (speeds) do not simply add. If the observer in S measures an object moving along the x axis at velocity u, then the observer in the S′ system, a frame of reference moving at velocity v in the x direction with respect to S, will measure the object moving with velocity u′ where (from the Lorentz transformations above):

$u'=\frac{dx'}{dt'}=\frac{\gamma \ (dx-v dt)}{\gamma \ (dt-v dx/c^2)}=\frac{(dx/dt)-v}{1-(v/c^2)(dx/dt)}=\frac{u-v}{1-uv/c^2} \ .$
The other frame S will measure:

$u=\frac{dx}{dt}=\frac{\gamma \ (dx'+v dt')}{\gamma \ (dt'+v dx'/c^2)}=\frac{(dx'/dt')+v}{1+(v/c^2)(dx'/dt')}=\frac{u'+v}{1+u'v/c^2} \ .$
Notice that if the object were moving at the speed of light in the S system (i.e. u = c), then it would also be moving at the speed of light in the S′ system. Also, if both u and v are small with respect to the speed of light, we will recover the intuitive Galilean transformation of velocities

$u' \approx u-v \ .$
The usual example given is that of a train (frame S′ above) traveling due east with a velocity v with respect to the tracks (frame S). A child inside the train throws a baseball due east with a velocity u′ with respect to the train. In classical physics, an observer at rest on the tracks will measure the velocity of the baseball (due east) as u = u′ + v, while in special relativity this is no longer true; instead the velocity of the baseball (due east) is given by the second equation: u = (u′ + v)/(1 + u′v/c²). Again, there is nothing special about the x or east directions. This formalism applies to any direction by considering parallel and perpendicular motion to the direction of relative velocity v, see main article for details.
Einstein's addition of colinear velocities is consistent with the Fizeau experiment which determined the speed of light in a fluid moving parallel to the light, but no experiment has ever tested the formula for the general case of non-parallel velocities.^{[citation needed]}

Other consequences

Thomas rotation

See also: Thomas rotation
The orientation of an object (i.e. the alignment of its axes with the observer's axes) may be different for different observers. Unlike other relativistic effects, this effect becomes quite significant at fairly low velocities as can be seen in the spin of moving particles.

Equivalence of mass and energy

Main article: Mass–energy equivalence
As an object's speed approaches the speed of light from an observer's point of view, its relativistic mass increases thereby making it more and more difficult to accelerate it from within the observer's frame of reference.
The energy content of an object at rest with mass m equals mc². Conservation of energy implies that, in any reaction, a decrease of the sum of the masses of particles must be accompanied by an increase in kinetic energies of the particles after the reaction. Similarly, the mass of an object can be increased by taking in kinetic energies.
In addition to the papers referenced above—which give derivations of the Lorentz transformation and describe the foundations of special relativity—Einstein also wrote at least four papers giving heuristic arguments for the equivalence (and transmutability) of mass and energy, for E = mc².
Mass–energy equivalence is a consequence of special relativity. The energy and momentum, which are separate in Newtonian mechanics, form a four-vector in relativity, and this relates the time component (the energy) to the space components (the momentum) in a nontrivial way. For an object at rest, the energy–momentum four-vector is (E, 0, 0, 0): it has a time component which is the energy, and three space components which are zero. By changing frames with a Lorentz transformation in the x direction with a small value of the velocity v, the energy momentum four-vector becomes (E, Ev/c², 0, 0). The momentum is equal to the energy multiplied by the velocity divided by c². As such, the Newtonian mass of an object, which is the ratio of the momentum to the velocity for slow velocities, is equal to E/c².
The energy and momentum are properties of matter and radiation, and it is impossible to deduce that they form a four-vector just from the two basic postulates of special relativity by themselves, because these don't talk about matter or radiation, they only talk about space and time. The derivation therefore requires some additional physical reasoning. In his 1905 paper, Einstein used the additional principles that Newtonian mechanics should hold for slow velocities, so that there is one energy scalar and one three-vector momentum at slow velocities, and that the conservation law for energy and momentum is exactly true in relativity. Furthermore, he assumed that the energy of light is transformed by the same Doppler-shift factor as its frequency, which he had previously shown to be true based on Maxwell's equations.^[1] The first of Einstein's papers on this subject was "Does the Inertia of a Body Depend upon its Energy Content?" in 1905.^[20] Although Einstein's argument in this paper is nearly universally accepted by physicists as correct, even self-evident, many authors over the years have suggested that it is wrong.^[21] Other authors suggest that the argument was merely inconclusive because it relied on some implicit assumptions.^[22]
Einstein acknowledged the controversy over his derivation in his 1907 survey paper on special relativity. There he notes that it is problematic to rely on Maxwell's equations for the heuristic mass–energy argument. The argument in his 1905 paper can be carried out with the emission of any massless particles, but the Maxwell equations are implicitly used to make it obvious that the emission of light in particular can be achieved only by doing work. To emit electromagnetic waves, all you have to do is shake a charged particle, and this is clearly doing work, so that the emission is of energy.^[23]^[24]

How far can one travel from the Earth?

See also: Space travel using constant acceleration
Since one can not travel faster than light, one might conclude that a human can never travel further from Earth than 40 light years if the traveler is active between the age of 20 and 60. One would easily think that a traveler would never be able to reach more than the very few solar systems which exist within the limit of 20–40 light years from the earth. But that would be a mistaken conclusion. Because of time dilation, a hypothetical spaceship can travel thousands of light years during the pilot's 40 active years. If a spaceship could be built that accelerates at a constant 1g, it will after a little less than a year be traveling at almost the speed of light as seen from Earth. Time dilation will increase his life span as seen from the reference system of the Earth, but his lifespan measured by a clock traveling with him will not thereby change. During his journey, people on Earth will experience more time than he does. A 5 year round trip for him will take 6½ Earth years and cover a distance of over 6 light-years. A 20 year round trip for him (5 years accelerating, 5 decelerating, twice each) will land him back on Earth having traveled for 335 Earth years and a distance of 331 light years.^[25] A full 40 year trip at 1 g will appear on Earth to last 58,000 years and cover a distance of 55,000 light years. A 40 year trip at 1.1 g will take 148,000 Earth years and cover about 140,000 light years. A one-way 28 year (14 years accelerating, 14 decelerating as measured with the cosmonaut's clock) trip at 1 g acceleration could reach 2,000,000 light-years to the Andromeda Galaxy.^[26] This same time dilation is why a muon traveling close to c is observed to travel much further than c times its half-life (when at rest).^[27]

Causality and prohibition of motion faster than light

See also: Causality (physics) and Tachyonic antitelephone

Diagram 2. Light cone

In diagram 2 the interval AB is 'time-like'; i.e., there is a frame of reference in which events A and B occur at the same location in space, separated only by occurring at different times. If A precedes B in that frame, then A precedes B in all frames. It is hypothetically possible for matter (or information) to travel from A to B, so there can be a causal relationship (with A the cause and B the effect).
The interval AC in the diagram is 'space-like'; i.e., there is a frame of reference in which events A and C occur simultaneously, separated only in space. There are also frames in which A precedes C (as shown) and frames in which C precedes A. If it were possible for a cause-and-effect relationship to exist between events A and C, then paradoxes of causality would result. For example, if A was the cause, and C the effect, then there would be frames of reference in which the effect preceded the cause. Although this in itself won't give rise to a paradox, one can show^[28]^[29] that faster than light signals can be sent back into one's own past. A causal paradox can then be constructed by sending the signal if and only if no signal was received previously.
Therefore, if causality is to be preserved, one of the consequences of special relativity is that no information signal or material object can travel faster than light in vacuum. However, some "things" can still move faster than light. For example, the location where the beam of a search light hits the bottom of a cloud can move faster than light when the search light is turned rapidly.^[30]
Even without considerations of causality, there are other strong reasons why faster-than-light travel is forbidden by special relativity. For example, if a constant force is applied to an object for a limitless amount of time, then integrating F = dp/dt gives a momentum that grows without bound, but this is simply because $p = m \gamma v$ approaches infinity as $v$ approaches c. To an observer who is not accelerating, it appears as though the object's inertia is increasing, so as to produce a smaller acceleration in response to the same force. This behavior is observed in particle accelerators, where each charged particle is accelerated by the electromagnetic force.
Theoretical and experimental tunneling studies carried out by Günter Nimtz and Petrissa Eckle claimed that under special conditions signals may travel faster than light.^[31]^[32]^[33]^[34] It was measured that fiber digital signals were traveling up to 5 times c and a zero-time tunneling electron carried the information that the atom is ionized, with photons, phonons and electrons spending zero time in the tunneling barrier. According to Nimtz and Eckle, in this superluminal process only the Einstein causality and the special relativity but not the primitive causality are violated: Superluminal propagation does not result in any kind of time travel.^[35]^[36] Several scientists have stated not only that Nimtz' interpretations were erroneous, but also that the experiment actually provided a trivial experimental confirmation of the special relativity theory.^[37]^[38]^[39]

Geometry of spacetime

Main article: Minkowski space

Comparison between flat Euclidean space and Minkowski space

See also: line element

Orthogonality and rotation of coordinate systems compared between left: Euclidean space through circular angle φ, right: in Minkowski spacetime through hyperbolic angle φ (red lines labelled c denote the worldlines of a light signal, a vector is orthogonal to itself if it lies on this line).^[40]

Special relativity uses a 'flat' 4-dimensional Minkowski space – an example of a spacetime. Minkowski spacetime appears to be very similar to the standard 3-dimensional Euclidean space, but there is a crucial difference with respect to time.
In 3D space, the differential of distance (line element) ds is defined by

$ds^2 = d\mathbf{x} \cdot d\mathbf{x} = dx_1^2 + dx_2^2 + dx_3^2,$
where dx = (dx₁, dx₂, dx₃) are the differentials of the three spatial dimensions. In Minkowski geometry, there is an extra dimension with coordinate X⁰ derived from time, such that the distance differential fulfills

$ds^2 = -dX_0^2 + dX_1^2 + dX_2^2 + dX_3^2,$
where dX = (dX₀, dX₁, dX₂, dX₃) are the differentials of the four spacetime dimensions. This suggests a deep theoretical insight: special relativity is simply a rotational symmetry of our spacetime, analogous to the rotational symmetry of Euclidean space (see image right).^[41] Just as Euclidean space uses a Euclidean metric, so spacetime uses a Minkowski metric. Basically, special relativity can be stated as the invariance of any spacetime interval (that is the 4D distance between any two events) when viewed from any inertial reference frame. All equations and effects of special relativity can be derived from this rotational symmetry (the Poincaré group) of Minkowski spacetime.
The actual form of ds above depends on the metric and on the choices for the X⁰ coordinate. To make the time coordinate look like the space coordinates, it can be treated as imaginary: X₀ = ict (this is called a Wick rotation). According to Misner, Thorne and Wheeler (1971, §2.3), ultimately the deeper understanding of both special and general relativity will come from the study of the Minkowski metric (described below) and to take X⁰ = ct, rather than a "disguised" Euclidean metric using ict as the time coordinate.
Some authors use X⁰ = t, with factors of c elsewhere to compensate; for instance, spatial coordinates are divided by c or factors of c^±2 are included in the metric tensor.^[42] These numerous conventions can be superseded by using natural units where c = 1. Then space and time have equivalent units, and no factors of c appear anywhere.

3D spacetime

Three dimensional dual-cone.

Null spherical space.

If we reduce the spatial dimensions to 2, so that we can represent the physics in a 3D space

$ds^2 = dx_1^2 + dx_2^2 - c^2 dt^2,$
we see that the null geodesics lie along a dual-cone (see image right) defined by the equation;

$ds^2 = 0 = dx_1^2 + dx_2^2 - c^2 dt^2$
or simply

$dx_1^2 + dx_2^2 = c^2 dt^2,$
which is the equation of a circle of radius c dt.

4D spacetime
If we extend this to three spatial dimensions, the null geodesics are the 4-dimensional cone:

$ds^2 = 0 = dx_1^2 + dx_2^2 + dx_3^2 - c^2 dt^2$
so

$dx_1^2 + dx_2^2 + dx_3^2 = c^2 dt^2.$
This null dual-cone represents the "line of sight" of a point in space. That is, when we look at the stars and say "The light from that star which I am receiving is X years old", we are looking down this line of sight: a null geodesic. We are looking at an event a distance $d = \sqrt{x_1^2+x_2^2+x_3^2}$ away and a time d/c in the past. For this reason the null dual cone is also known as the 'light cone'. (The point in the lower left of the picture below represents the star, the origin represents the observer, and the line represents the null geodesic "line of sight".)
The cone in the −t region is the information that the point is 'receiving', while the cone in the +t section is the information that the point is 'sending'.
The geometry of Minkowski space can be depicted using Minkowski diagrams, which are useful also in understanding many of the thought-experiments in special relativity.
Note that, in 4d spacetime, the concept of the center of mass becomes more complicated, see center of mass (relativistic).

Physics in spacetime

Transformations of physical quantities between reference frames
Above, the Lorentz transformation for the time coordinate and three space coordinates illustrates that they are intertwined. This is true more generally: certain pairs of "timelike" and "spacelike" quantities naturally combine on equal footing under the same Lorentz transformation.
The Lorentz transformation in standard configuration above, i.e. for a boost in the x direction, can be recast into matrix form as follows:

$\begin{pmatrix} ct'\\ x'\\ y'\\ z' \end{pmatrix} = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0\\ -\beta\gamma & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct\\ x\\ y\\ z \end{pmatrix} = \begin{pmatrix} \gamma ct- \gamma\beta x\\ \gamma x - \beta \gamma ct \\ y\\ z \end{pmatrix}.$
In Newtonian mechanics, quantities which have magnitude and direction are mathematically described as 3d vectors in Euclidean space, and in general they are parametrized by time. In special relativity, this notion is extended by adding the appropriate timelike quantity to a spacelike vector quantity, and we have 4d vectors, or "four vectors", in Minkowski spacetime. The components of vectors are written using tensor index notation, as this has numerous advantages. The notation makes it clear the equations are manifestly covariant under the Poincaré group, thus bypassing the tedious calculations to check this fact. In constructing such equations, we often find that equations previously thought to be unrelated are, in fact, closely connected being part of the same tensor equation. Recognizing other physical quantities as tensors simplifies their transformation laws. Throughout, upper indices (superscripts) are contravariant indices rather than exponents except when they indicate a square (this is should be clear from the context), and lower indices (subscripts) are covariant indices. For simplicity and consistency with the earlier equations, Cartesian coordinates will be used.
The simplest example of a four-vector is the position of an event in spacetime, which constitutes a timelike component ct and spacelike component x = (x, y, z), in a contravariant position four vector with components:

$X^\nu = (X^0, X^1, X^2, X^3)= (ct, x, y, z).$
where we define X⁰ = ct so that the time coordinate has the same dimension of distance as the other spatial dimensions; so that space and time are treated equally.^[43]^[44]^[45] Now the transformation of the contravariant components of the position 4-vector can be compactly written as:

$X^{\mu'}=\Lambda^{\mu'}{}_\nu X^\nu$
where there is an implied summation on ν from 0 to 3, and $\Lambda^{\mu'}{}_{\nu}$ is a matrix.
More generally, all contravariant components of a four-vector $T^\nu$ transform from one frame to another frame by a Lorentz transformation:

$T^{\mu'} = \Lambda^{\mu'}{}_{\nu} T^\nu$
Examples of other 4-vectors include the four-velocity U^μ, defined as the derivative of the position 4-vector with respect to proper time:

$U^\mu = \frac{dX^\mu}{d\tau} = \gamma(v)( c , v_x , v_y, v_z ) .$
where the Lorentz factor is:

$\gamma(v)= \frac{1}{\sqrt{1- (v/c)^2}} \,,\quad v^2 = v_x^2 + v_y^2 + v_z^2 \,.$
The relativistic energy $E = \gamma(v)mc^2$ and relativistic momentum $\mathbf{p} = \gamma(v)m \mathbf{v}$ of an object are respectively the timelike and spacelike components of a covariant four momentum vector:

$P_\nu = m U_\nu = m\gamma(v)(c,v_x,v_y,v_z)= (E/c,p_x,p_y,p_z).$
where m is the invariant mass.
The four-acceleration is the proper time derivative of 4-velocity:

$A^\mu = \frac{d U^\mu}{d\tau} \,.$
The transformation rules for three-dimensional velocities and accelerations are very awkward; even above in standard configuration the velocity equations are quite complicated owing to their non-linearity. On the other hand, the transformation of four-velocity and four-acceleration are simpler by means of the Lorentz transformation matrix.
The four-gradient of a scalar field φ transforms covariantly rather than contravariantly:

$\begin{pmatrix} \frac{1}{c}\frac{\partial \phi}{\partial t'} & \frac{\partial \phi}{\partial x'} & \frac{\partial \phi}{\partial y'} & \frac{\partial \phi}{\partial z'}\end{pmatrix} = \begin{pmatrix} \frac{1}{c}\frac{\partial \phi}{\partial t} & \frac{\partial \phi}{\partial x} & \frac{\partial \phi}{\partial y} & \frac{\partial \phi}{\partial z}\end{pmatrix}\begin{pmatrix} \gamma & -\beta\gamma & 0 & 0\\ -\beta\gamma & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix} \,.$
that is:

$(\partial_{\mu'} \phi) = \Lambda_{\mu'}{}^{\nu} (\partial_\nu \phi)\,,\quad \partial_{\mu} \equiv \frac{\partial}{\partial x^{\mu}}\,.$
only in Cartesian coordinates. It's the covariant derivative which transforms in manifest covariance, in Cartesian coordinates this happens to reduce to the partial derivatives, but not in other coordinates.
More generally, the covariant components of a 4-vector transform according to the inverse Lorentz transformation:

$\Lambda_{\mu'}{}^{\nu} T^{\mu'} = T^\nu$
where $\Lambda_{\mu'}{}^{\nu}$ is the reciprocal matrix of $\Lambda^{\mu'}{}_{\nu}$ .
The postulates of special relativity constrain the exact form the Lorentz transformation matrices take.
More generally, most physical quantities are best described as (components of) tensors. So to transform from one frame to another, we use the well-known tensor transformation law^[46]

$T^{\alpha' \beta' \cdots \zeta'}_{\theta' \iota' \cdots \kappa'} = \Lambda^{\alpha'}{}_{\mu} \Lambda^{\beta'}{}_{\nu} \cdots \Lambda^{\zeta'}{}_{\rho} \Lambda_{\theta'}{}^{\sigma} \Lambda_{\iota'}{}^{\upsilon} \cdots \Lambda_{\kappa'}{}^{\phi} T^{\mu \nu \cdots \rho}_{\sigma \upsilon \cdots \phi}$
where $\Lambda_{\chi'}{}^{\psi}$ is the reciprocal matrix of $\Lambda^{\chi'}{}_{\psi}$ . All tensors transform by this rule.
An example of a four dimensional second order antisymmetric tensor is the relativistic angular momentum, which has six components: three are the classical angular momentum, and the other three are related to the boost of the center of mass of the system. The derivative of the relativistic angular momentum with respect to proper time is the relativistic torque, also second order antisymmetric tensor.
The electromagnetic field tensor is another second order antisymmetric tensor field, with six components: three for the electric field and another three for the magnetic field. There is also the stress–energy tensor for the electromagnetic field, namely the electromagnetic stress–energy tensor.

Metric
The metric tensor allows one to define the inner product of two vectors, which in turn allows one to assign a magnitude to the vector. Given the four-dimensional nature of spacetime the Minkowski metric η has components (valid in any inertial reference frame) which can be arranged in a 4 × 4 matrix:

$\eta_{\alpha\beta} = \begin{pmatrix} -1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}$
which is equal to its reciprocal, $\eta^{\alpha\beta}$ , in those frames. Throughout we use the signs as above, different authors use different conventions – see Minkowski metric alternative signs.
The Poincaré group is the most general group of transformations which preserves the Minkowski metric:

$\eta_{\alpha\beta} = \eta_{\mu'\nu'} \Lambda^{\mu'}{}_\alpha \Lambda^{\nu'}{}_\beta \!$
and this is the physical symmetry underlying special relativity.
The metric can be used for raising and lowering indices on vectors and tensors. Invariants can be constructed using the metric, the inner product of a 4-vector T with another 4-vector S is:

$T^{\alpha}S_{\alpha}=T^{\alpha}\eta_{\alpha\beta}S^{\beta} = T_{\alpha}\eta^{\alpha\beta}S_{\beta} = \text{invariant scalar}$
Invariant means that it takes the same value in all inertial frames, because it is a scalar (0 rank tensor), and so no Λ appears in its trivial transformation. The magnitude of the 4-vector T is the positive square root of the inner product with itself:

$|\mathbf{T}| = \sqrt{T^{\alpha}T_{\alpha}}$
One can extend this idea to tensors of higher order, for a second order tensor we can form the invariants:

$T^{\alpha}{}_{\alpha}\,,T^{\alpha}{}_{\beta}T^{\beta}{}_{\alpha}\,,T^{\alpha}{}_{\beta}T^{\beta}{}_{\gamma}T^{\gamma}{}_{\alpha} = \text{invariant scalars}\,,$
similarly for higher order tensors. Invariant expressions, particularly inner products of 4-vectors with themselves, provide equations that are useful for calculations, because one doesn't need to perform Lorentz transformations to determine the invariants.

Relativistic kinematics and invariance
The coordinate differentials transform also contravariantly:

$dX^{\mu'}=\Lambda^{\mu'}{}_\nu dX^\nu$
so the squared length of the differential of the position four-vector dX^μ constructed using

$d\mathbf{X}^2 = dX^\mu \,dX_\mu = \eta_{\mu\nu}\,dX^\mu \,dX^\nu = -(c dt)^2+(dx)^2+(dy)^2+(dz)^2\,$
is an invariant. Notice that when the line element dX² is negative that √−dX² is the differential of proper time, while when dX² is positive, √dX² is differential of the proper distance.
The 4-velocity U^μ has an invariant form:

${\mathbf U}^2 = \eta_{\nu\mu} U^\nu U^\mu = -c^2 \,,$
which means all velocity four-vectors have a magnitude of c. This is an expression of the fact that there is no such thing as being at coordinate rest in relativity: at the least, you are always moving forward through time. Differentiating the above equation by τ produces:

$2\eta_{\mu\nu}A^\mu U^\nu = 0.$
So in special relativity, the acceleration four-vector and the velocity four-vector are orthogonal.

Relativistic dynamics and invariance
The invariant magnitude of the momentum 4-vector generates the energy–momentum relation:

$\mathbf{P}^2 = \eta^{\mu\nu}P_\mu P_\nu = -(E/c)^2 + p^2 .$
We can work out what this invariant is by first arguing that, since it is a scalar, it doesn't matter which reference frame we calculate it, and then by transforming to a frame where the total momentum is zero.

$\mathbf{P}^2 = - (E_\mathrm{rest}/c)^2 = - (m c)^2 .$
We see that the rest energy is an independent invariant. A rest energy can be calculated even for particles and systems in motion, by translating to a frame in which momentum is zero.
The rest energy is related to the mass according to the celebrated equation discussed above:

$E_\mathrm{rest} = m c^2.$
Note that the mass of systems measured in their center of momentum frame (where total momentum is zero) is given by the total energy of the system in this frame. It may not be equal to the sum of individual system masses measured in other frames.
To use Newton's third law of motion, both forces must be defined as the rate of change of momentum with respect to the same time coordinate. That is, it requires the 3D force defined above. Unfortunately, there is no tensor in 4D which contains the components of the 3D force vector among its components.
If a particle is not traveling at c, one can transform the 3D force from the particle's co-moving reference frame into the observer's reference frame. This yields a 4-vector called the four-force. It is the rate of change of the above energy momentum four-vector with respect to proper time. The covariant version of the four-force is:

$F_\nu = \frac{d P_{\nu}}{d \tau} = m A_\nu$
In the rest frame of the object, the time component of the four force is zero unless the "invariant mass" of the object is changing (this requires a non-closed system in which energy/mass is being directly added or removed from the object) in which case it is the negative of that rate of change of mass, times c. In general, though, the components of the four force are not equal to the components of the three-force, because the three force is defined by the rate of change of momentum with respect to coordinate time, i.e. dp/dt while the four force is defined by the rate of change of momentum with respect to proper time, i.e. dp/dτ.
In a continuous medium, the 3D density of force combines with the density of power to form a covariant 4-vector. The spatial part is the result of dividing the force on a small cell (in 3-space) by the volume of that cell. The time component is −1/c times the power transferred to that cell divided by the volume of the cell. This will be used below in the section on electromagnetism.

Relativity and unifying electromagnetism

Main articles: Classical electromagnetism and special relativity and Covariant formulation of classical electromagnetism
Theoretical investigation in classical electromagnetism led to the discovery of wave propagation. Equations generalizing the electromagnetic effects found that finite propagation speed of the E and B fields required certain behaviors on charged particles. The general study of moving charges forms the Liénard–Wiechert potential, which is a step towards special relativity.
The Lorentz transformation of the electric field of a moving charge into a non-moving observer's reference frame results in the appearance of a mathematical term commonly called the magnetic field. Conversely, the magnetic field generated by a moving charge disappears and becomes a purely electrostatic field in a comoving frame of reference. Maxwell's equations are thus simply an empirical fit to special relativistic effects in a classical model of the Universe. As electric and magnetic fields are reference frame dependent and thus intertwined, one speaks of electromagnetic fields. Special relativity provides the transformation rules for how an electromagnetic field in one inertial frame appears in another inertial frame.
Maxwell's equations in the 3D form are already consistent with the physical content of special relativity, although they are easier to manipulate in a manifestly covariant form, i.e. in the language of tensor calculus.^[47] See main links for more detail.

Status

Main articles: Tests of special relativity and Criticism of relativity theory
Special relativity in its Minkowski spacetime is accurate only when the absolute value of the gravitational potential is much less than c² in the region of interest.^[48] In a strong gravitational field, one must use general relativity. General relativity becomes special relativity at the limit of weak field. At very small scales, such as at the Planck length and below, quantum effects must be taken into consideration resulting in quantum gravity. However, at macroscopic scales and in the absence of strong gravitational fields, special relativity is experimentally tested to extremely high degree of accuracy (10⁻²⁰)^[49] and thus accepted by the physics community. Experimental results which appear to contradict it are not reproducible and are thus widely believed to be due to experimental errors.
Special relativity is mathematically self-consistent, and it is an organic part of all modern physical theories, most notably quantum field theory, string theory, and general relativity (in the limiting case of negligible gravitational fields).
Newtonian mechanics mathematically follows from special relativity at small velocities (compared to the speed of light) – thus Newtonian mechanics can be considered as a special relativity of slow moving bodies. See classical mechanics for a more detailed discussion.
Several experiments predating Einstein's 1905 paper are now interpreted as evidence for relativity. Of these it is known Einstein was aware of the Fizeau experiment before 1905,^[50] and historians have concluded that Einstein was at least aware of the Michelson–Morley experiment as early as 1899 despite claims he made in his later years that it played no role in his development of the theory.^[17]

The Fizeau experiment (1851, repeated by Michelson and Morley in 1886) measured the speed of light in moving media, with results that are consistent with relativistic addition of colinear velocities.

The famous Michelson–Morley experiment (1881, 1887) gave further support to the postulate that detecting an absolute reference velocity was not achievable. It should be stated here that, contrary to many alternative claims, it said little about the invariance of the speed of light with respect to the source and observer's velocity, as both source and observer were travelling together at the same velocity at all times.

The Trouton–Noble experiment (1903) showed that the torque on a capacitor is independent of position and inertial reference frame.

The Experiments of Rayleigh and Brace (1902, 1904) showed that length contraction doesn't lead to birefringence for a co-moving observer, in accordance with the relativity principle.

Particle accelerators routinely accelerate and measure the properties of particles moving at near the speed of light, where their behavior is completely consistent with relativity theory and inconsistent with the earlier Newtonian mechanics. These machines would simply not work if they were not engineered according to relativistic principles. In addition, a considerable number of modern experiments have been conducted to test special relativity. Some examples:

Tests of relativistic energy and momentum – testing the limiting speed of particles

Ives–Stilwell experiment – testing relativistic Doppler effect and time dilation

Time dilation of moving particles – relativistic effects on a fast-moving particle's half-life

Kennedy–Thorndike experiment – time dilation in accordance with Lorentz transformations

Hughes–Drever experiment – testing isotropy of space and mass

Modern searches for Lorentz violation – various modern tests

Experiments to test emission theory demonstrated that the speed of light is independent of the speed of the emitter.

Experiments to test the aether drag hypothesis – no "aether flow obstruction".

Theories of relativity and quantum mechanics
Special relativity can be combined with quantum mechanics to form relativistic quantum mechanics. It is an unsolved problem in physics how general relativity and quantum mechanics can be unified; quantum gravity and a "theory of everything", which require such a unification, are active and ongoing areas in theoretical research.
The early Bohr–Sommerfeld atomic model explained the fine structure of alkali metal atoms using both special relativity and the preliminary knowledge on quantum mechanics of the time.^[51]
In 1928, Paul Dirac constructed an influential relativistic wave equation, now known as the Dirac equation in his honour,^[52] that is fully compatible both with special relativity and with the final version of quantum theory existing after 1926. This equation explained not only the intrinsic angular momentum of the electrons called spin, it also led to the prediction of the antiparticle of the electron (the positron),^[52]^[53] and fine structure could only be fully explained with special relativity. It was the first foundation of relativistic quantum mechanics. In non-relativistic quantum mechanics, spin is phenomenological and cannot be explained.
On the other hand, the existence of antiparticles leads to the conclusion that relativistic quantum mechanics is not enough for a more accurate and complete theory of particle interactions. Instead, a theory of particles interpreted as quantized fields, called quantum field theory, becomes necessary; in which particles can be created and destroyed throughout space and time.

combined waveform
wave 1
wave 2
	Two waves in phase	Two waves 180° out of phase

“	Reflections of this type made it clear to me as long ago as shortly after 1900, i.e., shortly after Planck's trailblazing work, that neither mechanics nor electrodynamics could (except in limiting cases) claim exact validity. Gradually I despaired of the possibility of discovering the true laws by means of constructive efforts based on known facts. The longer and the more desperately I tried, the more I came to the conviction that only the discovery of a universal formal principle could lead us to assured results... How, then, could such a universal principle be found?	”
—Albert Einstein: Autobiographical Notes^[7]